Cover Design

Dithering and color space conversion are two 
of the concepts discussed in "Video Rendering" 
which opens this issue's set of papers on multi- 
media technologies. On the cover, the band 
of blue across the bottom of the cover graphic 
shows the rectangular patterning created 
by an ordered dither process using a popular 
recursive tessellation array. The band of burgundy
across the top shows the superior patterning
of the same ordered dither process
with a newly designed void-and-cluster array,
which produces a higher quality image for dis- 
play by eliminating the rectangular patterns 
and the textures of white noise. The line illus- 
tration overlaying these two arrays presents 
two color spaces, one within the other: RGB 
and YUV (luminance-chrominance space used
by television systems; Y axis not shown). In the
color conversion process, data transmitted
in YUV space is converted to RGB space. The
cover design shows three faces of the RGB
space "lifted off" and infused with the colors
noted at each corner of the parallelepiped.

The cover concept and illustrations are 
derived from the paper "Video Rendering" 
by Bob Ulichney. The design was implemented
by Linda Falvella of Quantic
Communications, Inc.

Editor's Introduction



Jane C. Blake 

Managing Editor 

This issue of the Digital Technical Journal features 
papers on multimedia technologies and applica- 
tions, and on uses of the Application Control 
Architecture (ACA), Digital's implementation of the 
Object Management Group's CORBA specification. 

The high quality of today's television, film, and
sound recordings has set expectations for computer-based
multimedia; we expect high-quality
images, fast response times, good quality audio,
availability (including network transmission), and
all at "reasonable" cost. Bob Ulichney has written
about video image-rendering methods that are in 
fact fast, simple, and inexpensive to implement. He 
reviews a color rendering system and compares 
techniques that address the problem of insufficient 
colors for displaying video images. Dithering is one 
of these techniques, and he describes a new algo- 
rithm which provides good quality color and high- 
speed image rendering. 

The dithering algorithm is utilized in Software 
Motion Pictures. SMP is a method for generating 
digital video on desktop systems without the need 
for expensive decompression hardware. Burkhard 
Neidecker-Lutz and Bob Ulichney discuss issues 
encountered in designing portable video compres- 
sion software to display digital video on a range of 
display types. SMP has been ported to Alpha AXP, 
Sun, IBM, Hewlett-Packard, and Microsoft platforms. 

Digitized data — video or audio — must be com- 
pressed for efficient storage and transmission. 
Davis Pan surveys audio compression techniques, 
beginning with analog-to-digital conversion and 
data compression. He then discusses the Motion
Picture Experts Group (MPEG) audio algorithm and the interesting
problem of developing a real-time software imple- 
mentation of this algorithm. 

Even compressed, digitized data takes up tre- 
mendous amounts of storage space. A relational 
database can not only store this data but provide 
fast retrieval. Mark Riley, Jay Feenan, John Janosik, 
and T.K. Rengarajan describe DEC Rdb enhance- 
ments that support multimedia objects, i.e., text, 
still frame images, compound documents, and 
binary large objects. 

Managing image documents is the subject of a 
paper by Jan te Kiefte, Bob Hasenaar, Joop Mevius, 
and Theo van Hunnik. Megadoc is a hardware and 
software framework for building customized image 
management applications quickly and at low cost. 
They describe the UNIX file system interface to 
WORM drives, a storage manager, and an image 
application framework. 

Distributing multimedia over a network presents 
both engineering challenges and opportunities for 
applications. DECspin is a real-time, desktop video- 
conferencing application that operates over LANs or 
WANs, using TCP/IP or DECnet protocols. Larry and 
Ricky Palmer present an overview of the DECspin 
graphical interface. They then address network 
issues of real-time conferencing on non-real-time 
networks and a solution to network congestion. 

The transmission of full-motion video programs 
to multiple users requires adaptations in many 
parts of a client-server, LAN environment. Peter 
Hayden's paper focuses on the specific problem of 
efficient allocation of network addresses for the 
transmission of digital video data on a LAN. He 
reviews alternatives and describes a technique for 
the dynamic allocation of multicast addresses. 

The common theme of the two final papers is 
ACA Services, Digital's implementation of the OMG's 
Common Object Request Broker Architecture. Paul 
Patrick has written an instructive paper on CASE 
environment development utilizing ACA. Assuming 
a multivendor, distributed environment, he dis- 
cusses modeling of applications, data, and opera- 
tions; application interfacing; and environment 
management. 

DEC @aGlance software is an implementation of
ACA that supports the integration of manufacturing 
process information systems. David Ascher differ- 
entiates between generic integration software and 
@aGlance, and describes how ACA is used to inte- 
grate independently developed applications. 

The editors thank John Morse, engineering man- 
ager, Corporate Research, and Mary Ann Slavin, 
engineering manager, ACA, for their help in preparing
this issue.

In the late '80s, "multimedia" was a magic word.
It seduced us with glimpses of a brave new world 
where audio and video technology merged with 
computer technology. It promised us everything 
from instant high-impact business presentations 
to virtual reality. Words like "paradigm shift" and 
"multibillion-dollar industry" were enough to snare 
both the technophiles and the eager entrepreneurs 
into believing that the world had suddenly 
changed, and we were all going to get rich in the 
process. 

Somewhere on the way to the bank, reality set in, 
and it wasn't virtual. The reality is that multimedia is
a lot harder than it looks. Successful multimedia 
requires a marriage between analog TV technology 
and digital computer technology; it requires recon- 
ciliation between a technical/professional market- 
place and a consumer marketplace. As in any 
marriage, a lot of hard work is required to make it 
succeed, and much of that work is yet to be done. 

For certain segments of the computer industry, 
multimedia was relatively easy to implement and so 
caught on quickly. The first successes have been at 
the extremes of the cost spectrum — very low-end 
desktop multimedia on the one hand, and very 
high-end virtual reality systems on the other. This 
has left Digital, with its traditional focus on the 
middle, temporarily out of the game. 

For desktop multimedia, all that is required is the 
ability to capture and display video and audio. Since 
machines like the Commodore Amiga were already 
based more on TV technology than on computer 
technology (for cost reasons), they could be quickly
and cheaply adapted to handle audio and motion 
video. Thus desktop multimedia was born. The 
CD-ROM, adapted from audio CD technology, was the 
perfect storage medium for distribution of multi- 
media content; and so for this market segment, 
CD-ROM and multimedia became almost synonymous.
There has emerged a whole industry based around 
the production of multimedia titles on CD-ROM. 

At the high end, for purposes such as full-realism 
aircraft simulation or virtual reality applications, 
the solution was to use the highest performance 
hardware available, at whatever expense. Typically, 
high-end, three-dimensional graphics systems were 
coupled either to supercomputers or to massively 
parallel processor arrays. The result was, and still is, 
impressive. But the cost is still so high that such vir- 
tual reality systems are not yet commercially viable 
except in specialized low-volume markets.

The vast area in the middle, into which all of 
Digital's business falls, has developed very slowly. 
The problem is that our business is based on a 
model of enterprise-wide computation. The com- 
puter systems we design and sell not only include 
processors and displays but incorporate networks 
and servers as well. To introduce multimedia into 
such a model, one touches every aspect of the sys- 
tem, from the desktop, through the network, and 
back to the servers. At every turn, we have found 
that the technology that has evolved over 30 to 40 
years for handling numbers, text, and (more 
recently) two-dimensional and three-dimensional 
graphics is not quite right for video and audio. 
Every component of the system, both hardware and 
software, needs to change in some way. We need to
evolve to a model of networked client-server multi- 
media computing. Change of this magnitude is a 
slow process. 

Two challenges are so pervasive that almost 
every paper in this issue addresses them, each from 
a different perspective. First of all, multimedia 
involves the handling of large quantities of data. 
Second, for many applications, that data must be 
handled under very tight time constraints. The 
resulting stress and strain on all components of the 
system translates into a set of technical challenges 
that has occupied us for the last four years and 
promises to keep us busy through at least the rest of 
this decade. 

Depending on the picture quality chosen, it may 
require from one million to one hundred million 
bytes of storage to save each second of live video in 
digital form. Since many applications of multi- 
media, such as archiving television footage for 
research or historic preservation purposes, will 
need to save many hours of video, it is easy to see
that multimedia quickly builds demand for many
gigabytes (1,000,000,000 bytes) of magnetic or optical
disk storage. But storage is only part of the problem.
Once such enormous amounts of data are
stored, the challenge becomes how to retrieve a 
particular item of interest. Standard database tech- 
niques are oriented toward retrieval of text and 
numbers. Retrieval of audio and video information 
will require new file and database techniques that 
are only beginning to be understood. 

An obvious application of multimedia technol- 
ogy, once the networks are in place, is telecon- 
ferencing. We can envision a day when we can 
connect to anyone any place in the world via the 
network and carry on a conversation with them, 
while each of us sees the other in full-motion video, 
using the audio and video capabilities of our desk- 
top workstations and PCs. But realizing this vision 
has proved surprisingly hard. People expect the 
images they see to be synchronized with the sounds 
they hear, and they expect delays to be no worse 
than those experienced on a long-distance tele- 
phone call. Unfortunately, data networks have been 
designed to maximize throughput and reliability. 
They do this at the expense of some delay in trans- 
mission — delay that is annoying at best, and unac- 
ceptable at worst, for teleconferencing applications. 

Successful infusion of multimedia technology 
into enterprise-wide computation is proving to 
require change on a scale that almost no one antici- 
pated. We at Digital are in the midst of this process 
of change, and this issue of the Digital Technical 
Journal is a snapshot, taken at one point in time, of 
that process. Together, the papers describe some of 
the toughest technical challenges that we face and 
in many cases give glimpses into possible solutions.

Video Rendering

Video rendering, the process of generating device-dependent pixel data from 
device-independent sampled image data, is key to image quality. System compo- 
nents include scaling, color adjustment, quantization, and color space conversion. 
This paper emphasizes methods that yield high image quality, are fast, and yet are 
simple and inexpensive to implement. Particular attention is placed on the derivation
and analysis of new multilevel dithering schemes. While permitting smaller
frame buffers, dithering also provides faster transport of the processed image to the
display, a key benefit for the massive pixel rates associated with full-motion video.



Perhaps the most influential characteristic govern- 
ing the perceived value of a system that displays 
images is the way the pictures look. Image appear- 
ance is largely dependent upon the quality of render- 
ing, that is, the process of taking device-independent 
data and generating device-dependent data tailored 
for a particular target display. 

The topic of this paper is the processing of sam- 
pled image data and not synthetic graphics. For 
graphics rendering, primitives such as specifica- 
tions of triangles are converted to displayable pic- 
ture elements or pixels. The atomic elements 
handled by a video rendering system are device- 
independent pixels. Whereas a prerendered graph- 
ics image can be compactly represented as a 
collection of triangle vertices, prerendered video 
achieves compaction by means of compression 
techniques. 

Sampling broadcast video requires a data rate of 
more than 9 million color pixels per second; the 
need of some relief for storage and networks is 
clear. Video compression reduces redundancy in 
the source image and thereby reduces the amount 
of data to be transmitted. Dramatic reductions in 
data rate can be achieved with little degradation in 
image quality. The Joint Photographic Experts
Group (JPEG) standard for still frame and the
Motion Picture Experts Group (MPEG) and Px64
standards for motion video are current committee
compression techniques. 1 Several other nonstandard
schemes exist, including a simple compression
method conducive to software-only
implementation. 2

Video rendering receives decompressed image 
data as input. Since every decompressed pixel must 
be processed, speed is essential. This paper focuses 
on rendering methods that are fast, simple, and 
inexpensive to implement. Performance at video 
rates can be achieved with minimal hardware or 
even software-only solutions. 

The Rendering Architecture section reviews the 
components of a rendering system and examines 
design trade-offs. The paper then presents details 
of new and efficient dithering implementations. 
Finally, video color mapping is discussed. 

Rendering Architecture 

Figure 1 illustrates the major phases of a video ren- 
dering system: (1) filter and scale, (2) color adjust, 
(3) quantize, and (4) color space convert. 

In the first stage, the original image data must be 
resampled to match the target window size. A sepa- 
rate scaling system should be used for the horizon- 
tal and vertical directions to handle the case where 
the pixel aspect ratio must be changed. For exam- 
ple, such asymmetric scaling is needed when the 
target display expects square pixels and the original 
pixels are not square. 

The best filters to use in combination with scaling
have been determined from a perceptual point
of view. 3 When limiting the bandwidth to reduce
the data rate, a Gaussian filter with a standard deviation
$\sigma = 0.30 \times$ output period is recommended.
For interpolation, the filter preferred (because the
filtered results looked most like the original) was
a cascade of two: first, sharpen with a Laplacian
filter, and second, follow by convolution with a
Gaussian filter with $\sigma = 0.375 \times$ input period.

A typical sharpening scheme can be expressed 
by the following equation: 

$$I_{\text{sharp}}[x,y] = I[x,y] - \beta\,\psi[x,y] * I[x,y], \tag{1}$$

Figure 1 Image Rendering System

where $I[x,y]$ is the input image, $\psi[x,y]$ is a digital
Laplacian filter, and $*$ is the convolution operator. 4
The nonnegative parameter $\beta$ controls the degree
of sharpness, with $\beta = 0$ indicating no change in
sharpness. When enlarging, sharpening should 
occur before scaling, and when reducing, sharpen- 
ing should take place after scaling. The filtering dis- 
cussed here is assumed to be two-dimensional, 
which requires image line buffering. For economy, 
horizontal-only filtering is sometimes used. 
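
The sharpening step of equation (1) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the 3-by-3 four-neighbor Laplacian kernel, the edge handling (border pixels are replicated), and the function name `sharpen` are all assumptions, since the text specifies only "a digital Laplacian filter."

```python
def sharpen(image, beta):
    """Return I - beta * (psi * I) for a 2-D list of gray levels.

    psi is an assumed 4-neighbor Laplacian kernel; border pixels are
    replicated so the output has the same size as the input.
    """
    psi = [[0, 1, 0],
           [1, -4, 1],
           [0, 1, 0]]
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for j in (-1, 0, 1):
                for i in (-1, 0, 1):
                    yy = min(max(y + j, 0), height - 1)  # clamp row index
                    xx = min(max(x + i, 0), width - 1)   # clamp column index
                    acc += psi[j + 1][i + 1] * image[yy][xx]
            # The kernel is symmetric, so correlation equals convolution.
            out[y][x] = image[y][x] - beta * acc
    return out
```

The result is not clipped to the displayable range; a real pipeline would presumably clamp or rescale before the later stages.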

The simplest means of scaling is known as
nearest-neighbor scaling, and its simplest implementation
is based on the Bresenham scan conversion
algorithm for drawing straight lines. 5 This
algorithm can be applied to image scaling and performed
with only three registers and one adder. 6
Further optimizations make this algorithm especially
suited for real-time use. 7
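
As a rough sketch of the idea, nearest-neighbor scaling of one scan line can be driven by an integer error accumulator in the style of Bresenham's line algorithm, with no multiplications or divisions in the loop. The function name and the exact loop structure are illustrative assumptions and are not taken from the cited implementations.

```python
def scale_line_nearest(src, dst_len):
    """Resample one scan line to dst_len pixels by nearest-neighbor.

    An integer accumulator decides when to advance the source index,
    so output pixel k reads source pixel floor(k * len(src) / dst_len).
    Assumes src is non-empty whenever dst_len > 0.
    """
    src_len = len(src)
    out = []
    err = 0       # accumulated error, always kept below dst_len
    s = 0         # current source index
    for _ in range(dst_len):
        out.append(src[s])
        err += src_len
        while err >= dst_len:   # step the source index as the error overflows
            err -= dst_len
            s += 1
    return out
```

The same routine handles both enlargement and reduction; applying it separately to rows and columns gives the independent horizontal and vertical scaling described earlier.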

The second stage of rendering is color adjust, 
most easily achieved with a look-up table (LUT). 
Each color component uses a separate adjust LUT. 
In the case of a luminance-chrominance color, an 
adjust LUT for the luminance component controls 
contrast and brightness, and LUTs for the chromi- 
nance components control saturation. 
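
A minimal sketch of an adjust LUT for one component follows. The linear gain-about-midscale form, the parameter names `contrast` and `brightness`, and the clamping are assumptions chosen for illustration; the paper does not give the LUT contents.

```python
def make_adjust_lut(contrast=1.0, brightness=0.0, levels=256):
    """Build a look-up table mapping each input level to an adjusted level.

    contrast scales values about mid-scale and brightness adds an offset;
    results are rounded and clamped to the range 0 .. levels - 1.
    """
    mid = (levels - 1) / 2.0
    lut = []
    for v in range(levels):
        adjusted = (v - mid) * contrast + mid + brightness
        lut.append(int(min(max(round(adjusted), 0), levels - 1)))
    return lut
```

Applied to a luminance component, such a table changes contrast and brightness with one memory read per pixel; the same form applied to chrominance values centered on mid-scale would scale saturation.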

For so-called true-color frame buffers with 24-bit 
depths, visual artifacts that can result from insuffi- 
cient amplitude resolution do not occur. With 
smaller frame buffers, restricting the amplitude of 
the color components red, green, and blue (RGB) 
with a simple uniform quantizer causes false con- 
tours to appear in slowly varying regions. This issue 
leads to the third stage in the rendering system, 
quantization. 

The three basic classes of techniques for circumventing
the problem of insufficient colors or
color memory are (1) histogram-based methods,
(2) chrominance-subsampled frame buffers, and
(3) dithering. All histogram-based methods, sometimes
called palette selection, require two passes of
the entire image data: the first to acquire the histogram
statistics to fabricate a three-dimensional
quantizer to N colors and the second to perform
the pixel assignments. Perhaps the fastest method
is the popularity algorithm, where a simple sort
finds the N colors with the highest frequency, and
all other colors are mapped to those. 8
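
The two passes of the popularity algorithm can be sketched as follows. The nearest-color rule used in the second pass (squared Euclidean distance in RGB) is an assumption; the text says only that the remaining colors "are mapped to" the selected ones. Function names are illustrative.

```python
from collections import Counter


def popularity_palette(pixels, n):
    """Pass 1: histogram the image and keep the n most frequent colors."""
    counts = Counter(pixels)
    return [color for color, _ in counts.most_common(n)]


def map_to_palette(pixels, palette):
    """Pass 2: assign every pixel to the closest palette color."""
    def nearest(p):
        return min(palette,
                   key=lambda q: sum((a - b) ** 2 for a, b in zip(p, q)))
    return [nearest(p) for p in pixels]
```

Both passes touch every pixel, which illustrates why histogram-based methods are costly at video rates.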

A more compute-intensive method, but one that 
in general performs much better, is the often-used, 
median-cut algorithm. 8 In this method, the color
space is repeatedly subdivided into smaller rectan- 
gular solids at the median planes, with the goal that 
each of the selected colors represent an equal num- 
ber of colors in the image. The average of the colors 
in each of the final regions is the color used in the 
quantizer. A later, less compute-intensive variation 
is the mean-split algorithm. Also, several clustering 
techniques have been reported that result in less 
quantization error than the above-mentioned meth- 
ods. One method, for example, minimizes the sum 
of the squares of the errors. 9 In all cases, however, 
color problems can occur in other application windows
because each frame requires a different color
map; the colors in the other windows become 
scrambled in a different way for each color map. 

One advantage of representing image data in a 
luminance-chrominance space is that chrominance 
requires less spatial resolution than luminance to 
achieve excellent image quality. Visual perception 
of differences in chrominance is much less than that 
for luminance. The television standards have been 
exploiting this fact for decades. The quantization 
approach of using chrominance-subsampled frame 
buffers is built on this fact, deferring conversion to 
the RGB components until just after the data is read 
for display. 10, 11, 12

Typical implementations of chrominance- 
subsampled frame buffers average each of the 
two chrominance values in a given luminance- 
chrominance color representation over a region 
that is either 2 by 2 or 4 by 4 pixels. Assuming 8 bits
of amplitude resolution per color component, the
2-by-2-pixel case results in an average of ((2 × 2 × 8
luminance bits) + (8 + 8 chrominance bits))/
(2 × 2 pixels) or 12 bits per pixel; similarly, the
4-by-4-pixel case results in 9 bits per pixel. This
approach requires expensive hardware to up-sam- 
ple the chrominance components and convert the 
color space at video rates. These nonstandard 
frame buffers can also cause severe incompatibility 
problems with most applications that expect RGB 
frame buffers. While chrominance subsampled 
frame buffers can accommodate most sampled nat- 
ural images, thin-line graphics can be annihilated. 
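
The bits-per-pixel figures quoted above follow from a one-line calculation, sketched here with an illustrative helper (the function name is not from the paper):

```python
def bits_per_pixel(block_w, block_h, bits=8):
    """Average storage per pixel for a chrominance-subsampled frame buffer.

    Every pixel keeps its own luminance sample, while the two chrominance
    samples are shared by the whole block_w x block_h region.
    """
    luminance_bits = block_w * block_h * bits
    chrominance_bits = 2 * bits
    return (luminance_bits + chrominance_bits) / (block_w * block_h)


print(bits_per_pixel(2, 2))  # 12.0
print(bits_per_pixel(4, 4))  # 9.0
```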

The third alternative for quantization is to use 
a dithering method. Several methods exist that are 
designed primarily for binary output, but all are 
extendable to multilevel color. 13, 14, 15 A "level" is a
shade of gray, from black to white, or a shade of 
a color component, from black to the brightest 
value. The basic principle of dithering is to use the 
available subset of colors to produce, by judicious 
arrangement, the illusion of any color in between. 

Although neighborhood operations, most notably 
error diffusion, produce good-quality dithering, 
they are computationally complex and require 
additional storage. For video processing, where 
speed is essential, we turned our focus to those 
dithering methods that are point operations, that is, 
methods that operate on the current pixel only
without considering its neighbors. Each color com- 
ponent of every pixel in the image has an associated 
"noise" or dither amplitude that is added to it before 
that component is passed to a uniform quantizer. 

Historically, the first dithering method used for 
video processing was white noise dithering, where 
a pseudorandom number was added to each lumi- 
nance value before quantization. This method was 
practiced soon after the dawn of television. 16 
However, the low-frequency energy in white noise 
causes undesirable textures and graininess. 

A preferred method is the point process of 
ordered dithering, where a deterministic noise 
array tiles the plane in a periodic manner. Dither 
arrays can be designed to minimize low-frequency 
texture. The most popular are the so-called recursive
tessellation arrays. 17, 18 These arrays yield results
superior to those of white noise dithering but suf- 
fer from structured rectangular patterns. 
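
For illustration, a recursive tessellation template can be generated by repeatedly quadrupling a smaller one. The sketch below assumes the familiar Bayer-style construction, which is the usual reading of the term; the particular quadrant offsets and the function name are assumptions, and other equivalent orderings exist.

```python
def recursive_tessellation(order):
    """Return a 2**order by 2**order dither template with values 0 .. 4**order - 1.

    Each pass replaces the template by four scaled copies, offset so that
    consecutive threshold values land as far apart as possible.
    """
    template = [[0]]
    for _ in range(order):
        n = len(template)
        bigger = [[0] * (2 * n) for _ in range(2 * n)]
        for y in range(n):
            for x in range(n):
                v = 4 * template[y][x]
                bigger[y][x] = v              # top-left quadrant
                bigger[y][x + n] = v + 2      # top-right quadrant
                bigger[y + n][x] = v + 3      # bottom-left quadrant
                bigger[y + n][x + n] = v + 1  # bottom-right quadrant
        template = bigger
    return template
```

Calling `recursive_tessellation(3)` yields an 8-by-8 template of the kind used for Figure 2b; its regular quadrant structure is what produces the rectangular patterning noted above.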

A new ordered dither array design, called the 
"void-and-cluster" method, eliminates both the low- 
frequency textures of white noise and the rectangu- 
lar patterns of recursive tessellation arrays. 19 The 
name describes the dither array design process in 
which voids and clusters are located and mitigated. 

For the high-speed case of motion video, an 
ordered dithering scheme has important advan- 
tages over chrominance-subsampled frame buffers 
and histogram-based approaches. Quantization by 
dithering allows the use of conventional frame 
buffers, does not require the time-consuming pro- 
cess of making two passes over each frame (or 
every Nth frame), does not cause other applications
to change color maps with every Nth frame, and
allows any number of colors to be selected at ren- 
der time. Also, experiments have shown that the 
image quality achieved by dithering is very compet- 
itive with the other methods, when compared over 
a range of sample images. Even when 24-bit frame 
buffers are available, the increased speed of loading 
three or four 8-bit color pointers or index values in 
the time required to load a single 24-bit pixel makes 
dithering a viable alternative in the design of desk- 
top video systems. 

By way of comparison, Figure 2 illustrates some 
of the methods described in this section. A 240-by- 
360-pixel, 8-bit monochrome image was rendered 
to only two levels and displayed at 100 dots per inch 
(dpi). Figure 2a depicts an image that was dithered 
with white noise; in Figure 2b, the same image was 
dithered using an 8-by-8 recursive tessellation 
dither array; and Figure 2c shows the image 
dithered with the new 32-by-32 void-and-cluster 
array. To illustrate the effect of sharpening, Figure 
2d shows the image in Figure 2c presharpened 
using a digital Laplacian filter as in equation (1), 
with a sharpening factor of $\beta = 2.0$. The goal of this
coarse example is to amplify the different effects. 
The same methods apply to multilevel and color 
output, where the resulting quality is much higher. 

Fast Multilevel Dithering 

This section presents the details of simple, yet pow- 
erful new designs to perform multilevel ordered 
dithering. The simplicity of these methods allows 
for implementation with minimum hardware or 
software only, yet guarantees output that preserves 
the mean of the input. The designs are flexible in 
that they allow dithering from any number of input 
levels $N_i$ to any number of output levels $N_o$, provided
$N_i \ge N_o$. Note that $N_i$ and $N_o$ are not restricted
to be powers of two.

Each color component of a color image is treated 
as an independent image. The input image $L_i$ can
have values

$$L_i \in \{0, 1, 2, \ldots, (N_i - 1)\},$$

and the output image $L_o$ can have values

$$L_o \in \{0, 1, 2, \ldots, (N_o - 1)\}.$$

A deterministic dither array of size $M \times N$ is used
that is periodic and tiles the entire input image. To
simplify addressing of this array, $M$ and $N$ should
each be powers of two. A dither template defines
the order in which dither values are arranged. The
elements of the dither template $T$ have values

$$T \in \{0, 1, 2, \ldots, (N_t - 1)\},$$

where $N_t$ is the number of template levels, which
represent the levels against which image input
values are compared to determine their mapping
to the output values. The dither template is thus
central to determining the nature of the resulting
dither patterns.

Figure 2 Examples of Rendering to Two Output Levels: (a) Dither with a White Noise Threshold; (b) Dither with an 8-by-8 Recursive Tessellation Threshold Array; (c) Dither with a 32-by-32 Void-and-cluster Threshold Array; (d) Same as (c) with Laplacian Sharpening, $\beta = 2.0$

Figure 3 shows a dithering system that comprises
two memories and an adder. The system takes an
input level $L_i$ at image location $[x,y]$ and produces
output level $L_o$ at the corresponding location in the
dithered output image. The dither array is addressed
by $x'$ and $y'$, which represent the low-order bits of
the image address. The selected dither value
$d[x',y']$ is added to the input level to produce the
sum $s$. This sum is then quantized by addressing a
quantizer LUT to produce the output level $L_o$.

Figure 3 Dithering System with Two LUTs

The trick to achieving mean-preserving dithering 
is to properly generate the LUT values. The dither 
array is a normalized version of the dither template 
specified as follows: 

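$$d[x',y'] = \operatorname{int}\left\{\Delta_d \left(T[x',y'] + \tfrac{1}{2}\right)\right\}, \tag{2}$$

where $\Delta_d$, the step size between normalized dither
values, is defined as

$$\Delta_d = \frac{\Delta_Q}{N_t}, \tag{3}$$

and $\Delta_Q$ is the quantizer step size

$$\Delta_Q = \frac{N_i - 1}{N_o - 1}. \tag{4}$$

Note that $\Delta_Q$ also defines the range of dither values.
The quantizer LUT is a uniform quantizer with $N_o$
equal steps of size $\Delta_Q$.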

The precise expressions in equations (2), (3), and
(4) were arrived at through extensive analysis of the
average output resulting from processing input
images of a constant value, over a wide range of $N_i$,
$N_o$, and $N_t$.
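
The two-memory system of Figure 3 can be sketched in software as follows. This is a hedged illustration that reads equations (2) through (4) as d = int(Δ_d (T + 1/2)), Δ_d = Δ_Q / N_t, and Δ_Q = (N_i − 1)/(N_o − 1); the function names, the exact-integer arithmetic, and the use of the modulo operator in place of low-order address bits are assumptions made for the example.

```python
def make_dither_array(template, n_t, n_i, n_o):
    """Normalize a dither template: d = int(delta_d * (T + 1/2)).

    With delta_d = (n_i - 1) / ((n_o - 1) * n_t), the expression is
    evaluated exactly in integers to avoid floating-point rounding.
    """
    return [[((n_i - 1) * (2 * t + 1)) // (2 * n_t * (n_o - 1)) for t in row]
            for row in template]


def make_quantizer_lut(n_i, n_o, d_max):
    """Uniform quantizer with n_o steps of size delta_q = (n_i - 1) / (n_o - 1)."""
    return [(s * (n_o - 1)) // (n_i - 1) for s in range(n_i + d_max)]


def dither(image, template, n_t, n_i, n_o):
    """Dither a 2-D list of input levels (0 .. n_i - 1) to n_o output levels."""
    d = make_dither_array(template, n_t, n_i, n_o)
    d_max = max(max(row) for row in d)
    lut = make_quantizer_lut(n_i, n_o, d_max)
    rows, cols = len(d), len(d[0])
    # For power-of-two array sizes, y % rows equals masking the low-order bits.
    return [[lut[level + d[y % rows][x % cols]] for x, level in enumerate(line)]
            for y, line in enumerate(image)]
```

As a quick check of the mean-preserving property under these assumptions, a constant image of level 128 (with N_i = 256 and N_o = 2) dithered with a 16-level, 4-by-4 template turns on exactly half of the output pixels in each tile.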



One-memory Dithering System 
Using the above expressions, it is possible to sim- 
plify the system by exchanging one degree of free- 
dom for another. A bit shifter can replace the 
quantizer LUT at the expense of forcing the number 
of input levels $N_i$ to be set by the system. For hardware
implementations, this design affords a considerable
cost reduction.

The system and method of Figure 3 assume that
$N_i$ is given as a fixed parameter, as is usually the case
with most imaging systems and file formats.
However, for image sources such as hardware that
generates synthetic graphics, arbitrarily setting $N_i$
often has no effect on the amount of computation
involved. If an adjust LUT is used to modify the
image data, including a gain makes a "modified
adjust LUT." Figure 4 depicts such a system, where
$L_r$ is the raw input level. The unadjusted or raw
input image can have the values

$$L_r \in \{0, 1, 2, \ldots, (N_r - 1)\},$$

where $N_r$ is the number of raw input levels, typically
256. Therefore, the modified adjust LUT must
impart a gain of

$$\frac{N_i - 1}{N_r - 1}.$$



To solve for $N_i$, recall that in the method of Figure
3 the quantizer was defined to have equal steps of
size $\Delta_Q$ as defined in equation (4). The quantizer
LUT can be replaced by a simple R-bit shifter, if the
variable $\Delta_Q$ can be forced to be an exact binary
number,

$$\Delta_Q = 2^R. \tag{5}$$

$N_i$ can then be set by the expression

$$N_i = (N_o - 1)2^R + 1. \tag{6}$$

The integer $R$ is the number of bits the R-bit
shifter shifts to the right to achieve quantization.
Solving equation (6) for $R$ gives

$$R = \log_2\left(\frac{N_i - 1}{N_o - 1}\right). \tag{7}$$

Figure 4 One-memory Dithering System with an Adjust LUT and Bit Shifter



To completely specify this problem requires specifying
the range for $N_i$. It is reasonable to do this by
specifying the number of bits $b$ by which the image
input values are to be represented. Specifying $b$
limits $N_i$ to the range

$$2^{b-1} < N_i \le 2^b. \tag{8}$$

Parameter $b$ will be a key value in specifying the
resulting system.

Given the two expressions, (7) and (8), and the
two unknowns, $R$ and $N_i$, a unique solution exists
because the range of $N_i$ is less than a factor of two,
and $R$ and $N_i$ are integers. To solve for $R$, substitute
equation (6) for $N_i$ in equation (8). The resulting
equation is

$$2^{b-1} < (N_o - 1)2^R + 1 \le 2^b \tag{9}$$

or

$$\log_2\left(\frac{2^{b-1} - 1}{N_o - 1}\right) < R \le \log_2\left(\frac{2^b - 1}{N_o - 1}\right). \tag{10}$$

Since $2 \le N_o \le N_i$, the range of the expression in
equation (10) must be less than one. Hence, given
that $R$ is an integer,

$$R = \operatorname{int}\left\{\log_2\left(\frac{2^b - 1}{N_o - 1}\right)\right\}. \tag{11}$$

$N_i$ in equation (6) is now specified.

As an example, consider the case where $N_o$
equals 87 (levels), $b$ equals 9 (bits), $N_t$ equals 1,024
(levels, for a 32-by-32 template), and $N_r$ equals 256
(levels). Thus, $R$ equals 2, and the R-bit shifter drops
the least-significant 2 bits. $N_i$ equals 345 (levels);
the dither array is normalized by equation (2) with
$\Delta_d = 1/256$; and the gain factor to be included in the
modified adjust LUT is 344/255. This data is loaded
into the system represented by Figure 4 and uniformly
maps input pixels across the 87 true output
levels, giving the illusion of 256 levels.
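
The numbers in this example can be reproduced with a short calculation. The sketch below assumes that R is the integer part of log2((2^b − 1)/(N_o − 1)), that N_i = (N_o − 1)2^R + 1, that the dither step is 2^R / N_t, and that the adjust-LUT gain is (N_i − 1)/(N_r − 1); the function name is illustrative.

```python
from fractions import Fraction


def one_memory_parameters(n_o, b, n_t, n_r):
    """Derive the shift R, input levels N_i, dither step, and LUT gain."""
    # floor(log2(q)) of a ratio q >= 1 equals floor(log2(floor(q))),
    # which is bit_length() - 1 for a positive integer.
    r = ((2 ** b - 1) // (n_o - 1)).bit_length() - 1
    n_i = (n_o - 1) * 2 ** r + 1
    delta_d = Fraction(2 ** r, n_t)
    gain = Fraction(n_i - 1, n_r - 1)
    return r, n_i, delta_d, gain


r, n_i, delta_d, gain = one_memory_parameters(n_o=87, b=9, n_t=1024, n_r=256)
print(r, n_i, delta_d, gain)  # 2 345 1/256 344/255
```

With these values the largest possible sum, 344 plus a maximum dither value of 3, shifts right by 2 bits to 86, the top output level.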

The output image that results from either of the 
dithering systems illustrated in Figure 3 or Figure 4 
appears to contain more effective levels than are 
actually displayed. An effective level is either a perceived
average level that is dithered between two
true output levels or shades or an actual true output
level. A small number of template levels $N_t$ dictates
the resulting number of effective levels. When
$N_t$ is large, the number of effective levels is equal
to the number of input levels $N_i$ because it is not
possible to display more effective outputs than
inputs. More precisely,

$$\text{Effective Levels} = \begin{cases} (N_o - 1)N_t + 1, & \dfrac{\Delta_Q}{N_t} \ge 1, \\[1ex] N_i, & \dfrac{\Delta_Q}{N_t} < 1. \end{cases} \tag{12}$$

Note that $\Delta_Q / N_t$ in equation (12) is equal to $\Delta_d$.
When $\Delta_d < 1$, the normalization of the dither array,
i.e., equation (2), results in integer truncated values
that are not all unique. At this point, the number of
effective levels saturates to $N_i$.

Data Width Analysis 

The design of an efficient dithering system, particu- 
larly in hardware, depends on knowing the number 
of bits required for all data paths in the system. This 
section presents an analysis of the one-memory 
dithering system shown in Figure 4. 

The system input $b$, i.e., the bit depth of the
image input values, limits the data path for $L_i[x, y]$.
The analysis shows the derivation of the precise bit
depths for the other data paths. In summary, the
derivation proves that the dither values in the
dither matrix memory require $R$ bits, where $R_{max} =
(b - 1)$, and that $s = L_i + d$ (and thus the R-bit shifter)
requires only $b$ bits.

Bits Needed for Dither Matrix The amount of
memory needed to store the dither matrix is an
important concern; $d_{max}$ denotes the maximum
value. To determine $d_{max}$, substitute the maximum
value of $T[x', y']$, which is $(N_t - 1)$, into equation
(2). The resulting equation is

$$d_{max} = \mathrm{int}\left\{\Delta_d\left((N_t - 1) + \tfrac{1}{2}\right)\right\} = \mathrm{int}\left\{\frac{2^R}{N_t}\left((N_t - 1) + \tfrac{1}{2}\right)\right\}. \tag{13}$$

$d_{max}$, which depends on $N_t$, thus has a value in the
range

$$2^{R-1} \le d_{max} \le 2^R - 1. \tag{14}$$

For the case of a dither matrix with one value,
namely $N_t = 1$, $d_{max}$ equals the lower end of this
range. $d_{max}$ equals the high end of the range for
large dither matrices, where $2^{R-1} \le N_t$. An important
observation is that for all values in the range of
expression (14), the number of bits needed is
exactly $R$.

From equation (11), the value of $R$ increases as $N_o$
decreases. The smallest possible value of $N_o$ is 2,
which is for bitonal output. In this case, the maximum
value of $R$ is

$$R_{max} = \mathrm{int}\{\log_2(2^b - 1)\} = (b - 1). \tag{15}$$

So, the number of bits needed for the dither values
is $R$, which can be as large as $(b - 1)$.
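
The claim that the dither values always need exactly R bits can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the largest dither value is taken to be int{(2^R/N_t)((N_t - 1) + 1/2)}, which is how equation (13) is read here.

```python
from fractions import Fraction


def d_max(r: int, n_t: int) -> int:
    """Largest dither value: int{(2^R / N_t) * ((N_t - 1) + 1/2)}."""
    return int(Fraction(2 ** r, n_t) * (Fraction(n_t - 1) + Fraction(1, 2)))


def bits_for_dither_values(r: int, n_t: int) -> int:
    """Number of bits needed to hold every dither value up to d_max."""
    return d_max(r, n_t).bit_length()


# Exhaustive check over a small range of shifts and template sizes.
for r in range(1, 9):
    for n_t in range(1, 600):
        assert 2 ** (r - 1) <= d_max(r, n_t) <= 2 ** r - 1
        assert bits_for_dither_values(r, n_t) == r
```

Exact rational arithmetic is used so that the truncation in int{...} is not disturbed by floating-point error.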

Bit Width of Adder Recall that $s[x, y] = L_i[x, y] +
d[x, y]$. The number of bits needed for this sum
determines the size of the adder and the size of
the R-bit shifter. $L_i$ can be at most $(N_i - 1)$ and, as
determined in the last section, $d_{max}$ can be at most
$(2^R - 1)$. So,

$$s_{max} = (N_i - 1) + (2^R - 1). \tag{16}$$

From equation (6),

$$(N_i - 1) = (N_o - 1)\,2^R, \tag{17}$$

which gives

$$s_{max} = 2^R (N_o - 1) + (2^R - 1) = 2^R N_o - 1. \tag{18}$$

We can express $R$ in terms of $N_o$ by using equation (11):

$$R = \mathrm{int}\{\log_2(2^b - 1) - \log_2(N_o - 1)\}. \tag{19}$$

Each of the two terms in equation (19) can be
expressed in terms of an integer part and a fractional
part:

$$\log_2(2^b - 1) = (b - 1) + \epsilon_1, \tag{20}$$

where

$$0 \le \epsilon_1 < 1,$$

and

$$\log_2(N_o - 1) = K + \epsilon_2, \tag{21}$$

where $K$ is an integer, and

$$0 \le \epsilon_2 < 1.$$

Now equation (19) can be rewritten as

$$R = (b - 1) - K + \mathrm{int}\{\epsilon_1 - \epsilon_2\}. \tag{22}$$

$\epsilon_2$ is largest when $N_o$ (an integer) is a large power of
2. Because $N_o$ cannot be greater than $N_i$,

$$2^b \ge N_o.$$

This fact, combined with equations (20) and (21),
yields the further condition

$$\epsilon_1 \ge \epsilon_2.$$

Therefore, $\mathrm{int}\{\epsilon_1 - \epsilon_2\}$ in equation (22) must be
equal to zero, and the value of $R$ becomes

$$R = b - 1 - K. \tag{23}$$

We can express $s_{max}$ in equation (18) in terms of the
same integer $K$ of equation (21) by noting that

$$\log_2 N_o = K + \epsilon_3, \tag{24}$$

where

$$0 < \epsilon_3 \le 1. \tag{25}$$

Observe that $\epsilon_3$ is equal to 1, where $N_o$ is an exact
power of 2. Substituting $N_o = 2^{K + \epsilon_3}$
and equation (23) into equation (18) gives

$$s_{max} = 2^{b-1-K}\,2^{K+\epsilon_3} - 1 = 2^{b-1+\epsilon_3} - 1. \tag{26}$$

Because of the range of $\epsilon_3$ in equation (25), the
range of $s_{max}$ must be

$$2^{b-1} - 1 < s_{max} \le 2^b - 1, \tag{27}$$

which requires exactly $b$ bits.

As a check, the size of the shift register should
equal the number of bits required for $N_o$ plus $R$. The
number of bits needed for $N_o$ is

$$\mathrm{int}\{1 + \log_2(N_o - 1)\}. \tag{28}$$

Using the expression in equation (21), this value
becomes

$$\mathrm{int}\{1 + K + \epsilon_2\} = K + 1. \tag{29}$$

So, the size of the shift register must be

$$(K + 1) + (b - 1 - K) = b \text{ bits},$$

which matches the maximum size of the sum $s$.
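
The result of this derivation, that the adder and shifter need exactly b bits whatever the number of output levels, can be confirmed by brute force. The sketch below is illustrative only; the function name is hypothetical, and it assumes the maximum sum is 2^R N_o - 1 with R chosen as int{log2((2^b - 1)/(N_o - 1))}.

```python
def adder_width(n_o: int, b: int) -> tuple[int, int, int]:
    """Return (R, s_max, bits needed for s_max) for n_o output levels."""
    # R = int{log2((2^b - 1) / (N_o - 1))}, computed with exact integers.
    r = ((2 ** b - 1) // (n_o - 1)).bit_length() - 1
    s_max = 2 ** r * n_o - 1
    return r, s_max, s_max.bit_length()


# Every legal number of output levels should give a b-bit sum.
for b in range(1, 11):
    for n_o in range(2, 2 ** b + 1):
        r, s_max, width = adder_width(n_o, b)
        assert 2 ** (b - 1) - 1 < s_max <= 2 ** b - 1
        assert width == b

print(adder_width(87, 9))   # (2, 347, 9)
```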

Color Space Conversion 

Referring once again to Figure 1, consider the final 
subsystem of a video rendering system — color 
space convert. Assuming a frame buffer that is 
expecting RGB data, color space conversion is not 
necessary if the source data is already represented 
in RGB, as in the case of graphics generation 
systems. However, motion video is essentially 
always transmitted and stored in a luminance- 
chrominance space. Such a representation allows 
subsampling of the chrominance, as mentioned ear- 
lier, which reduces bandwidth requirements; all 
video standards exploit this method of bandwidth 
reduction. It is also more intuitive to color adjust in 
a luminance-chrominance space.

Prior to proceeding to the quantize subsystem
shown in Figure 1, all color components must be at 
the same final spatial resolution for a dithering 
method to work correctly. Chrominance compo- 
nents, then, need to be up-sampled to the same rate 
as luminance components. 

Although the chromaticities of the RGB primaries 
of the major television standards vary slightly, all 
television systems transmit and store the color data 



in YUV space. Y represents the achromatic compo- 
nent that is loosely called the luminance com- 
ponent. (The term luminance has a specific 
photometric definition that is not what is repre- 
sented in a video Y component.) U and V are color 
difference components, where U is proportional to 
(Blue - Y) and V is proportional to (Red - Y). 

Figure 5 is an orthographic projection of YUV
space. Inside the YUV rectangular solid is the
parallelepiped of "feasible" RGB space. Feasible RGB
points are those that are nonnegative and are not
greater than the maximum supported value. For reference,
the corners of the RGB parallelepiped are
labeled black (K), white (W), red (R), green (G),
blue (B), cyan (C), magenta (M), and yellow (L). RGB
and YUV values are related linearly and can be
interconverted by means of a 3-by-3 matrix multiply.

Figure 5 Feasible RGB Values in the YUV Color Space
(three views: Y-axis out, V-axis in, U-axis out)
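
The linear relationship between RGB and YUV, and the notion of a feasible RGB point, can be sketched as follows. The coefficients are an assumption: the paper does not list them here, so the widely used luma weights 0.299, 0.587, 0.114 and the scale factors 0.492 and 0.877 for U and V are used for illustration, with components normalized to the range 0 to 1. All function names are hypothetical.

```python
KR, KG, KB = 0.299, 0.587, 0.114   # assumed luma weights
KU, KV = 0.492, 0.877              # assumed scale factors for B - Y and R - Y

# Forward 3-by-3 matrix: rows produce Y, U = KU*(B - Y), V = KV*(R - Y).
RGB_TO_YUV = (
    (KR, KG, KB),
    (-KU * KR, -KU * KG, KU * (1 - KB)),
    (KV * (1 - KR), -KV * KG, -KV * KB),
)


def rgb_to_yuv(r, g, b):
    return tuple(m[0] * r + m[1] * g + m[2] * b for m in RGB_TO_YUV)


def yuv_to_rgb(y, u, v):
    """Inverse transform, written out in closed form."""
    r = y + v / KV
    b = y + u / KU
    g = (y - KR * r - KB * b) / KG
    return r, g, b


def is_feasible(rgb, max_value=1.0, eps=1e-9):
    """True if every component lies in [0, max_value] (within eps)."""
    return all(-eps <= c <= max_value + eps for c in rgb)


print(is_feasible(yuv_to_rgb(0.5, 0.0, 0.0)))   # mid gray: True
print(is_feasible(yuv_to_rgb(0.0, 0.3, 0.0)))   # zero luminance, blue-ish: False
```

The second call shows a YUV triple that lies inside the YUV rectangular solid but outside the RGB parallelepiped: with zero luminance, any nonzero chrominance forces at least one RGB component negative.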

In the United States video broadcast system, the 
chrominance plane (i.e., the U-V plane in Figure 5) 
is rotated 33 degrees by introducing a phase in the 
quadrature modulation of the chrominance signal. 
The resulting rotated chrominance signals are 
renamed I and Q (for inphase and quadrature), but 
the unmodulated color space is still YUV.

Figure 6 shows the back end of a rendering sys- 
tem that uses dithering as a quantization step prior 
to color space conversion. A serendipitous conse- 
quence of dithering is that color space conversion 
can be achieved by means of table look-up. The 
collective address formed by the dithered Y, U, and 
V values is small enough to require a reasonably 
sized color mapping LUT. There are two advantages 
to this approach. First, a costly dematrixing opera- 
tion is not required, and second, infeasible RGB val- 
ues can be intelligently mapped back to feasible 
space off-line during the generation of the color 
mapping LUT. 

Figure 6 System for Dithering Three-color

This second advantage is an important one,
because 77 percent of the valid YUV coordinates
are in invalid RGB space, i.e., the space around the
RGB parallelepiped in Figure 5. Color adjustments
such as increasing the brightness or saturation can
push otherwise valid RGB values into infeasible
space. In alternative systems that perform color
conversion by dematrixing, out-of-bounds RGB values
are simply truncated. This operation effectively
maps colors back to feasible RGB space along lines
perpendicular to a parallelepiped surface illustrated
in Figure 5, which can change the color in an
undesirable way. The use of a color mapping LUT
avoids these problems.
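
A color mapping LUT of this kind can be generated off-line along the following lines. This is a sketch under several assumptions, not the method used in the paper: the conversion coefficients and U, V ranges are the common values 0.299/0.587/0.114, 0.492/0.877, and 0.436/0.615; the LUT is a plain dictionary; and infeasible colors are pulled back by shrinking the chrominance toward gray (which keeps Y unchanged), chosen here only as one plausible alternative to per-component truncation. All names are hypothetical.

```python
KR, KG, KB = 0.299, 0.587, 0.114   # assumed luma weights
KU, KV = 0.492, 0.877              # assumed U and V scale factors
U_MAX, V_MAX = 0.436, 0.615        # resulting extremes of U and V


def yuv_to_rgb(y, u, v):
    r = y + v / KV
    b = y + u / KU
    g = (y - KR * r - KB * b) / KG
    return r, g, b


def feasible(rgb, eps=1e-9):
    return all(-eps <= c <= 1.0 + eps for c in rgb)


def to_feasible_rgb(y, u, v, steps=20):
    """Shrink chrominance toward gray until the color is feasible."""
    lo, hi = 0.0, 1.0              # scale 0 is gray, feasible for 0 <= y <= 1
    if feasible(yuv_to_rgb(y, u, v)):
        lo = 1.0
    else:
        for _ in range(steps):     # bisection on the chrominance scale
            mid = (lo + hi) / 2
            if feasible(yuv_to_rgb(y, mid * u, mid * v)):
                lo = mid
            else:
                hi = mid
    rgb = yuv_to_rgb(y, lo * u, lo * v)
    return tuple(min(255, max(0, round(c * 255))) for c in rgb)


def build_color_lut(ny, nu, nv):
    """Map every dithered (Y, U, V) level triple to an 8-bit RGB triple."""
    lut = {}
    for iy in range(ny):
        y = iy / (ny - 1)
        for iu in range(nu):
            u = (2 * iu / (nu - 1) - 1) * U_MAX
            for iv in range(nv):
                v = (2 * iv / (nv - 1) - 1) * V_MAX
                lut[(iy, iu, iv)] = to_feasible_rgb(y, u, v)
    return lut


lut = build_color_lut(5, 5, 5)     # 125 entries for 5 levels per component
```

Because the table is built once, the cost of the feasibility search does not affect rendering speed; at run time the dithered Y, U, and V levels simply index the table.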

Summary 

Video is becoming an increasingly important data 
type for desktop systems. This is especially true as 
distinctions between computing, consumer electronics,
and communications continue to blur.
While many factors contribute to the impression 
one has of the value of a product that displays infor- 
mation, the way the images look can make the 
biggest difference. This paper focuses on rendering 
system designs that are fast, low cost, produce 
good-quality video, and are conducive to hardware 
or software implementation.

Software Motion Pictures

Software motion pictures is a method of generating digital video on general- 
purpose desktop computers without using special decompression hardware. The 
compression algorithm is designed for rapid decompression in software and gener- 
ates deterministic data rates for use from CD-ROM and network connections. The 
decompression part offers device independence and integrates well with existing 
window systems and application programming interfaces. Software motion pic- 
tures features a portable, low-cost solution to digital video playback. 



The necessary initial investment is one of the major 
obstacles in making video a generic data type, like 
graphics and text, in general-purpose computer 
systems. The ability to display video usually requires 
some combination of specialized frame buffer, 
decompression hardware, and a high-speed network. 

A software-only method of generating a video
display provides an attractive way of solving the 
problems of cost and general access but poses chal- 
lenging questions in terms of efficiency. Although 
several digital video standards either exist or have 
been proposed, their computational complexity 
exceeds the power of most current desktop sys- 
tems. 1 In addition, a compression algorithm alone 
does not address the integration with existing win- 
dow system hardware and software. 

Software motion pictures (SMP) is both a video 
compression algorithm and a complete software 
implementation of that algorithm. SMP was specifi- 
cally designed to address all the issues concerning 
integration with desktop systems. A typical applica- 
tion of SMP on a low-end workstation is to play back 
color digital video at a resolution of 320 by 240 
pixels with a coded data rate of 1.1 megabits per 
second. On a DECstation 5000 Model 240 HX work- 
station, this task uses less than 25 percent of the 
overall machine resources. 

Together with suitable audio support (audio sup- 
port is beyond the scope of this paper), software 
motion pictures provides portable, low-cost digital 
video playback. 

The SMP Product 

Digital supplies SMP in several forms. The most 
complete version of SMP comes with the XMedia 
Toolkit. This toolkit is primarily designed for devel- 
opers of multimedia applications who include the 



SMP functionality inside their own applications. 
Figure 1 shows the user controls as displayed on a 
workstation screen. SMP players are also available 
on Digital's freeware compact disc (CD) for use 
with Alpha AXP workstations running the DEC 
OSF/1 AXP operating system. In addition, SMP play- 
back is included with several Digital products such 
as the video help utility on the SPIN (sound picture 
information networks) application, as well as other 
vendors' products, such as the MediaImpact multimedia
authoring system. 2

In the XMedia Toolkit, access to the SMP functions 
is possible through X applications, command line 
utilities, and C language libraries. The applications 
and utilities support simple editing operations, 
frame capture, compression, and other functions. 
Most of these features are intended for use by pro- 
ducers of simple file formats called SMP clips. 

The decompression functionality is offered as an 
X toolkit widget that readily integrates into the 
Open Software Foundation's (OSF) Motif-based 
applications. Multiple SMP codecs (compressors/ 
decompressors) on a given screen all share the 
same color resources with one another and with 
the Display PostScript X-server extension, which is 
offered by all major workstation vendors. It also 
plays well with the standard color allocations used 
in the Macintosh QuickDraw rendering system and 
Microsoft Windows standard color allocations.

To facilitate flexible but simple access to entire 
films of SMP frames, SMP defines SMP clips. Rather 
than publishing that file format directly, all applica- 
tions and widgets are accessed through an encap- 
sulating library. This method allows future releases 
to have application-transparent changes to the
underlying file structure and completely different
ways to store and obtain SMP frames.

Figure 1 User Controls as Displayed on the
Workstation Screen

An example of the latter is the storage of SMP 
clips directly in a relational database system in 
which no files exist, such as SQL Multimedia. The 
video data is stored directly in database records, 
and the client receives the data through the stan- 
dard remote database access protocols. At the 
receiving client, the SMP clip library is used to generate
a virtual SMP clip for the application program
by substituting a new read function. 

The SMP product also contains image converters 
that translate to and from the popular PBMPLUS fam- 
ily of image formats, allowing import and export to 
about 70 different image formats, including the 
Digital Document Interchange Format (DDIF). This 
allows the use of almost any image format as input 
for creation of SMP clips. 

Historical Background and 
Requirements 

In 1989 Digital's Distributed Multimedia Group 
experimented briefly with an algorithm called 
color cell compression (CCC) that had been 



described in 1986 by Campbell et al. 3 CCC is a cod- 
ing method that is optimized for rapid decom- 
pression of color images in software. We built a 
demonstrator that rapidly displayed CCC-coded 
images in a loop to create a motion video effect. 
The demonstrator then served as our study 
vehicle to create a usable product for digital video 
playback. 

Performing digital video entirely in software 
would stress the systems at all levels (I/O, proces- 
sor, and graphics), so we needed to establish upper 
bounds for what we could hope to achieve with our 
desktop systems and workstations. 

From the user's perspective, large sizes and high 
frame rates are desirable. These features need to be 
balanced with the limitations of real hardware. We 
modeled the data path through which digital video 
would have to flow in the system and measured the 
available resources on the slowest system we 
would use, a DECstation 2100. This workstation has 
a 12.5-megahertz (MHz) MIPS R2000 processor and 
a simple, 8-bit color frame buffer.

By merging this measurement with user feedback 
concerning the smallest acceptable image size and 
frame rate, we set our performance goal to play 
back movies of size 240 by 320 on the slowest 
DECstation processor with an 8-bit display at 15 
frames per second. Smaller viewing sizes are almost 
invisible on a typical high-resolution workstation 
screen. 

We settled for a frame rate of 15 frames per sec- 
ond. This rate is reasonably smooth: to the human 
eye, it appears as motion rather than separate 
images. It can be generated easily from 30-frame
source material, such as standard video used in
North America and Japan, by taking every other
frame. Consequently, on the DECstation 2100 we
would have at most

$$\frac{12.5 \times 10^6 \text{ clock cycles/second}}{(320 \times 240 \times 15) \text{ pixels/second}} = 10.85 \text{ clock cycles per pixel}$$

Thus, we must average no more than (approxi- 
mately) ten machine instructions to decode and 
render each pixel to the screen. 

In order to set our target for compression
efficiency, we looked at the volume of data and possible
distribution methods. CD-ROM looked promising,
and this data rate was also chosen by the
Motion Picture Experts Group (MPEG)-1 standard. 4
Hence, our goal was to maintain
a coded data rate for this size and frame rate
that would allow playback from a CD-ROM. To
achieve this goal, we limited the coded data rate
for the video component to 135 to 142 kilobytes
per second, leaving 8 to 15 kilobytes per
second for audio. In addition, we had to limit fluctuations
of the coded data rate to allow sensible
use of bandwidth reservation protocols for playback
over a network without complex buffering
schemes.
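
The video data rate budget translates directly into a bits-per-pixel target. The sketch below is a rough check only; it assumes 1 kilobyte means 1,024 bytes, which the text does not state, and the function name is hypothetical.

```python
WIDTH, HEIGHT, FPS = 320, 240, 15
BYTES_PER_KILOBYTE = 1024   # assumption; 1,000 would give slightly lower figures


def bits_per_pixel(kilobytes_per_second: float) -> float:
    """Average coded bits available per displayed pixel."""
    bits_per_second = kilobytes_per_second * BYTES_PER_KILOBYTE * 8
    return bits_per_second / (WIDTH * HEIGHT * FPS)


low, high = bits_per_pixel(135), bits_per_pixel(142)
print(f"{low:.2f} to {high:.2f} bits per pixel")   # 0.96 to 1.01 bits per pixel
```

Under this assumption the budget works out to roughly 1 bit per pixel, and 135 kilobytes per second corresponds to about 1.1 megabits per second.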

More interesting were the issues that became 
apparent when we attempted to use the prototype 
for real applications. The digital video material had 
to be usable on a wide range of display types, and 
due to its large volume, keeping specialized ver- 
sions for different displays was prohibitive. We 
would have to adapt the rendition of the coded 
material to the device-dependent color capabilities 
of the target display at run time. 

Our design center used 8-bit color-mapped dis- 
plays. These were (and still are) the most common 
color displays, and the demonstrator was based 
on them. Integration of the video into applications 
in a multitasking environment necessitated that 
computational as well as color resources were 
available for use by other applications. The system 
would have to perform cooperative sharing of 
the scarce color resources on displays with limited 
color capabilities. 

From the perspective of portability, we needed 
to conform to existing X11 interfaces, without any
hidden back doors into the window system. The
X Window System affords no direct way of writing
into the frame buffer. Rather, the MIT-SHM extension
is used to write an image into a shared memory seg- 
ment, and then the X server must copy it into the 
frame buffer. This method would impact our 
already strained CPU budget for the codec opera- 
tion. We would need to decompress video in our 
code and have the X server perform a copy opera- 
tion of the decompressed video to the screen, again 
using the main CPU. Quick measurements showed 
that the copy alone would use approximately 50 
percent of the CPU budget for an 8-bit frame buffer, 
and another 5 to 10 percent would be used by read- 
ing the coded data from I/O devices. 

With approximately five clock cycles per pixel 
yet to be rendered, it became clear why none of the 
standard video algorithms was of any use for such a 
task. We went back to the original CCC algorithm 
and started the development of software motion 
pictures. 
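
The cycle budget quoted in this section can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the function names are hypothetical, and the shares for the screen copy and I/O are the approximate percentages given in the text.

```python
def cycles_per_pixel(clock_hz: float, width: int, height: int, fps: int) -> float:
    """Clock cycles available per displayed pixel."""
    return clock_hz / (width * height * fps)


def remaining_cycles(total: float, copy_share: float, io_share: float) -> float:
    """Cycles left for decoding after the screen copy and I/O take their share."""
    return total * (1.0 - copy_share - io_share)


total = cycles_per_pixel(12.5e6, 320, 240, 15)   # about 10.85
best = remaining_cycles(total, 0.50, 0.05)       # about 4.9
worst = remaining_cycles(total, 0.50, 0.10)      # about 4.3
print(f"{total:.2f} total, {worst:.1f} to {best:.1f} left for decoding")
```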



Comparison with Other 
Video Algorithms 

Today (early 1993), a number of digital video com- 
pression algorithms are in use. All of them are 
guarded closely as proprietary and therefore 
closed, and only one algorithm predates the devel- 
opment of SMP. Although we could not build on 
experiences with these for our work, we believe
the internal workings of most of them are similar to
SMP with some additions.

A popular method for video compression is 
frame differencing. Rather than each frame being 
encoded separately, only those parts of the images 
that have changed relative to a preceding (or 
future) frame are encoded (together with the infor- 
mation that the other blocks did not change). This 
method works well for some input material, for 
example, in video conferences where the camera 
does not move. The method fails, however, on 
almost all other video material. 

To enable frame differencing on a wider range of 
input scenes, a method known as motion estima- 
tion is used by some algorithms. The encoder for an 
image sequence performs a search for blocks that 
have moved between frames and encodes the 
motion. This search step is computationally very 
expensive and usually defeats real-time encoding, 
even for special-purpose hardware. 

One of the earliest algorithms was digital video 
interactive (DVI) from Intel/IBM. It comes in two 
variations, real-time video (RTV) and production 
level video (PLV). RTV uses an unknown block 
encoding scheme and frame differencing. PLV 
adds motion estimation to this. RTV is comparable
to SMP in compression efficiency, computationally 
more expensive, and much worse in image quality. 
PLV cannot be done in software and requires 
special-purpose supercomputers for compression. 
Compression efficiency of PLV is about twice as 
good as SMP, and image quality is somewhat better. 
The more recent INDEO video boards from Intel 
use RTV. 

In 1992 Apple introduced QuickTime, which 
contains several video compression codecs. The 
initial RoadPizza (RP) video codec uses simple 
frame differencing and a block encoding similar to 
CCC, but without the color quantization step. (This 
is a guess based on the visual appearance and per- 
formance characteristics.) Compression efficiency 
of RP is three times worse than SMP, and image quality
is comparable on 24-bit displays and much
worse than SMP on 8-bit displays. Performance is
difficult to compare since SMP does not yet run on
Macintosh computers.

The newer Compact Video (CV) codec intro- 
duced in QuickTime version 1.5 is similar to CCC 
with frame differencing and has compression 
efficiency much closer to SMP. Image quality on 
8-bit displays is still lower than SMP, and compres- 
sion times are almost unusable (i.e., long). 

The newest entry into the market for software 
video codecs is the video 1 codec in Microsoft's 
Video for Windows product. Very little is known 
about it, but it seems to be close to CCC with frame 
differencing. Finally, Sun Microsystems has included 
CCC with frame differencing in their upcoming version
of the XIL imaging library.

Three well-known standards for image and video 
compression have been established by the Joint 
Photographic Experts Group (JPEG) and the Motion 
Picture Experts Group (MPEG) committees of 
the International Organization for Standardization 
(ISO) and by the Comité Consultatif International
Télégraphique et Téléphonique (CCITT). These
standards are computationally too expensive to be 
performed in software in all but the most powerful 
workstations today. 

The Algorithm 

The SMP algorithm is a pixel-based, lossy compres- 
sion algorithm, designed for minimum CPU loading. 
It features acceptable image quality, medium com- 
pression ratios, and a totally predictable coded data 
rate. No entropy-based or computationally expen- 
sive transform-based coding techniques are used. 
The downside of this approach is a limited image 
quality and compression ratio; however, for a wide 
range of applications, SMP quality is sufficient. 

Block Truncation Coding 
In 1978, the method referred to as block truncation 
coding (BTC) was independently reported in the 
United States by Mitchell, Delp, and Carlton and in 
Japan by Kishimoto, Mitsuya, and Hoshida. 5,6

BTC is a gray-scale image compression technique. 
The image is first segmented into 4 by 4 blocks. For 
each block, the 16-pixel average is found and used 
as a threshold. Each pixel is then assigned to a high 
or a low group in relation to this threshold. An 
example of the first stage in the coding process is 
shown in Figure 2a, in which the sample mean 
is 101. Each pixel in the block is thus truncated to 
1 bit, based on this threshold (see Figure 2b).

(b) The average of 101 is used as a threshold
to segment the block. 

Figure 2 Block Truncation Coding of 
a 4 by 4 Block 

For each of the two groups, the average is then
calculated again, giving a low average, $a$, and a high
average, $b$. Mathematically, the first and second
statistical moments of the block are preserved.
Therefore, for a block of $m$ pixels, with $q$ pixels
greater than the sample mean $\bar{x}$, and sample
variance $\sigma^2$, it can be shown that

$$a = \bar{x} - \sigma \sqrt{q/(m-q)}$$
$$b = \bar{x} + \sigma \sqrt{(m-q)/q}$$

More intuitively, the bit mask represents the 
shape of things in the block, and the average lumi- 
nance and contrast of the block contents are pre- 
served. With this coding method, for blocks of 4 by 
4 pixels and 8-bit gray values, a 16-bit mask and two
8-bit values encode the 16 pixels in 32 bits for a rate 
of 2.0 bits per pixel. 
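
The coding and decoding steps just described can be sketched in a few lines. The code below is an illustration rather than a reference implementation: the function names and the bit ordering of the mask (bit i for pixel i in row-major order) are assumptions. Two ways of choosing the levels are shown, the group averages and the mean/variance formulas given above.

```python
import math


def btc_encode(block):
    """Encode 16 gray values (row-major 4x4) as (mask, low, high)."""
    mean = sum(block) / len(block)
    mask, low, high = 0, [], []
    for i, pixel in enumerate(block):
        if pixel > mean:
            mask |= 1 << i          # bit i set: pixel i is in the high group
            high.append(pixel)
        else:
            low.append(pixel)
    a = round(sum(low) / len(low))  # low is never empty
    b = round(sum(high) / len(high)) if high else a
    return mask, a, b


def btc_decode(mask, a, b):
    """Rebuild the 16 pixels from the mask and the two levels."""
    return [b if (mask >> i) & 1 else a for i in range(16)]


def moment_preserving_levels(block):
    """Levels a and b from the mean/variance formulas (unrounded)."""
    m = len(block)
    mean = sum(block) / m
    sigma = math.sqrt(sum((p - mean) ** 2 for p in block) / m)
    q = sum(1 for p in block if p > mean)
    if q == 0:                      # flat block: both levels equal the mean
        return mean, mean
    return (mean - sigma * math.sqrt(q / (m - q)),
            mean + sigma * math.sqrt((m - q) / q))


block = [100] * 8 + [200] * 8
mask, a, b = btc_encode(block)      # 16-bit mask plus two 8-bit levels
```

For this two-valued block the decoder reproduces the input exactly; for natural image blocks the reconstruction keeps the shape given by the mask but replaces the pixel values with the two levels.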

Color Cell Compression 

Lema and Mitchell first extended BTC to color
by employing a luminance-chrominance space. 7
However, the direction taken by Campbell et al.
was computationally faster for decoding. In this
approach, a luminance value is computed for each
pixel. As in the BTC algorithm, the sample mean of
the luminance in each 4 by 4 block is used to segment
pixels into low and high groups based on
luminance values only. The 24-bit color values
assigned to the low and high groups are found by
independently solving for the 8-bit red, green, and
blue values. This allows each block to be represented
by a 16-bit mask and two 24-bit color values,
for a coding rate of 4 bits per pixel.

The 24-bit values are mapped to a set of 256 8-bit 
color index values by means of a histogram-based 
palette selection scheme known as the median cut 
algorithm. 9 Thus every block can be represented by 
two 8-bit color indices and the 16-bit mask, yielding 
2 bits per pixel; however, each image frame must 
also send the table of 256 24-bit color values.
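
The block-level part of color cell compression can be sketched as follows. This is an illustration under stated assumptions: luminance is computed with the common integer weights 299, 587, and 114 (scaled by 1,000), the group colors are taken as per-channel averages, and the palette selection step (median cut) is omitted. The function names are hypothetical.

```python
def luminance_x1000(rgb):
    """Integer luminance scaled by 1000, using assumed weights."""
    r, g, b = rgb
    return 299 * r + 587 * g + 114 * b


def mean_color(pixels):
    """Per-channel average of a list of (r, g, b) pixels, rounded to 8 bits."""
    n = len(pixels)
    return tuple(round(sum(p[c] for p in pixels) / n) for c in range(3))


def ccc_encode_block(block):
    """Encode 16 (r, g, b) pixels as (mask, low_rgb, high_rgb)."""
    lum = [luminance_x1000(p) for p in block]
    total = sum(lum)
    mask, low, high = 0, [], []
    for i, (pixel, y) in enumerate(zip(block, lum)):
        if y * len(block) > total:      # exact test for y > mean luminance
            mask |= 1 << i
            high.append(pixel)
        else:
            low.append(pixel)
    low_rgb = mean_color(low)           # low is never empty
    high_rgb = mean_color(high) if high else low_rgb
    return mask, low_rgb, high_rgb


block = [(100, 0, 0)] * 8 + [(255, 255, 255)] * 8
print(ccc_encode_block(block))   # (65280, (100, 0, 0), (255, 255, 255))
```

The result holds a 16-bit mask and two 24-bit colors, 64 bits for 16 pixels; replacing each color with an 8-bit palette index would bring this down to 32 bits per block.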

Software Motion Pictures Compression 
With our goal of 320 by 240 image resolution play- 
back at 15 frames per second, straight CCC coding 
would have resulted in a data stream of more than 
292 kilobytes per second, which is well beyond the 
capabilities of standard CD-ROM drives. Thus SMP 
needed to improve the compression ratio of CCC 
approximately twofold. 
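
The data rate figure for straight CCC coding can be reproduced with simple arithmetic. The sketch assumes that the 768-byte color table is sent with every frame, as described for CCC, and that 1 kilobyte means 1,024 bytes; the variable names are illustrative.

```python
WIDTH, HEIGHT, FPS = 320, 240, 15

blocks_per_frame = (WIDTH // 4) * (HEIGHT // 4)   # 4,800 blocks of 4 by 4 pixels
block_bytes = 2 + 1 + 1                           # 16-bit mask + two 8-bit indices
table_bytes = 256 * 3                             # 256 entries of 24 bits
frame_bytes = blocks_per_frame * block_bytes + table_bytes
bytes_per_second = frame_bytes * FPS
kilobytes_per_second = bytes_per_second / 1024    # assumes 1 KB = 1,024 bytes

print(frame_bytes, bytes_per_second, kilobytes_per_second)   # 19968 299520 292.5
```

Compared with a video budget of roughly 135 to 142 kilobytes per second, this is about twice what a CD-ROM-based stream can carry.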

Given that we could not apply any of the more 
expensive compression techniques, we looked for 
computationally cheap data-reduction techniques. 
Since most of these techniques negatively impact 
image quality, we needed a visual test bed to judge 
the impact of each change. 

We computed the images off-line for a short 
sequence, frame by frame, and then preloaded the 
images into the workstation memory. The player 
program then moved the images to the frame buffer 
in a loop, allowing us to view the results as they 
would be seen in the final version. The use of this 
technique provided two advantages. First, we 
could discover motion artifacts that were invisible 
in any individual frame. Second, we could judge the 
covering aspects of motion, which tends to brush 
over some defects that look objectionable in a still 
frame. 

At first, interframe or frame difference coding 
looked like a reasonable technique for achieving 
better compression results without sacrificing 
image quality but this was highly dependent on the 
nature of the input material. Due to the low CPU 
budget, we could not use any of the more elaborate 
motion compensation algorithms, so even slight 
movements in the input video material largely 
defeated frame differencing. Typically, we achieved 
only 10 percent better compression with inter- 
frame coding, while introducing considerable 
complexity to the compression and decoding oper- 
ations. As a result, we dropped interframe coding 
and made SMP a pure intraframe method, simplify- 
ing editing operations and random access to 



digitized material. At the same time, this opened up 
use of SMP for still image applications. 

To reach our final compression ratio goal of 
approximately 1 bit per pixel, we settled for a com- 
bination of two subsampling techniques. Similar 
techniques have been independently described by 
Pins, who conducted an exhaustive search and eval- 
uation of compression techniques. 10 His findings 
served as a check on our experiments. 

Blocks with a low ratio of foreground-to-back- 
ground luminance (a metric that can be interpreted 
as contrast) are represented in SMP by a single color 
and no mask. This reduces the coded representation
to a single byte compared to 4 bytes in CCC,
which amounts to a fourfold subsampling of such 
blocks. No chrominance information enters into 
this decision. It is surprising, but even very marked 
chrominance differences in foreground/background 
pairs are readily accepted by the human eye. 

With the introduction of a second kind of block, 
additional encoding information was necessary to 
distinguish normal (structured) CCC blocks from 
the subsampled (flat) blocks. In the SMP encoding, 
this is handled by a bitmap with one bit flagging 
each block. 

Because the adaptive subsampling alone did not 
yield enough data reduction for our compression 
goal, we added fixed subsampling for the struc- 
tured blocks. The horizontal resolution of the 
structured blocks in SMP is halved relative to CCC by 
horizontally averaging two neighboring pixels, 
which reduces the number of bits in the mask from 
16 to 8. This reduction leads to blurred vertical 
edges but looks reasonable for natural video 
images. Fixed subsampling allowed the encoding of 
structured blocks with 3 bytes instead of 4 bytes. 
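
The two subsampling steps can be combined in a small encoder sketch. It is illustrative only and makes several assumptions: it operates on gray values rather than color indices for simplicity, it decides between flat and structured blocks by comparing the ratio of the high level to the low level against a hypothetical threshold (`contrast_ratio`), and the bit ordering of the 8-bit mask is arbitrary. None of these details are specified in the text.

```python
def smp_encode_gray_block(block, contrast_ratio=1.2):
    """Encode 16 gray values (row-major 4x4) as a flat or structured block.

    Returns ("flat", value) or ("structured", mask8, low, high).
    contrast_ratio is a hypothetical tuning parameter.
    """
    # Fixed subsampling: average each horizontal pair, 16 -> 8 values.
    halved = [(block[i] + block[i + 1]) / 2 for i in range(0, 16, 2)]
    mean = sum(halved) / len(halved)
    mask, low, high = 0, [], []
    for i, value in enumerate(halved):
        if value > mean:
            mask |= 1 << i
            high.append(value)
        else:
            low.append(value)
    a = sum(low) / len(low)                      # low is never empty
    b = sum(high) / len(high) if high else a
    # Adaptive subsampling: low-contrast blocks keep a single value.
    if b <= contrast_ratio * max(a, 1.0):
        return ("flat", round(mean))
    return ("structured", mask, round(a), round(b))


edge = [20, 20, 200, 200] * 4                    # strong vertical edge
print(smp_encode_gray_block(edge))               # ('structured', 170, 20, 200)
print(smp_encode_gray_block([100] * 16))         # ('flat', 100)
```

A flat block needs one byte and a structured block three (an 8-bit mask and two values), in addition to the one-bit flag per block that tells the decoder which kind follows.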

We reapplied these ideas to the original gray-scale
block truncation algorithm. We added a variation
to the format that does not use a color look-up
table but interprets the foreground and background 
colors directly as luminance values. Images com- 
pressed in this format code gray-scale input mate- 
rial more compactly (there is no need to transmit 
the leading color look-up table as in CCC); they also 
do not suffer from the quantization band effects 
inherent in the color quantization used in the CCC 
algorithm. 

We varied the ratio of flat to structured blocks 
to effect a trade-off between image quality and 
compression ratio; however, the range of useful settings
is relatively small. If too few structured blocks
are allocated, the image essentially is scaled down
fourfold, which makes the image look very blocky.
If too many structured blocks are allocated, regions
of the image that have little detail are encoded with
unnecessary overhead. Over the wide range of
images we tested, allocating between 30 percent 
and 50 percent of structured blocks worked best, 
yielding a bit rate of 0.9 to 1.0 bits per pixel. For 
color images, the overhead of the color table (768 
bytes) must be added. 
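
The relationship between the share of structured blocks and the bit rate can be estimated with a simple model. The sketch assumes one flag bit per block, one byte per flat block, and three bytes per structured block, as described above, and ignores the color table and any other framing overhead; the function name is hypothetical.

```python
def smp_bits_per_pixel(structured_fraction: float) -> float:
    """Estimated coded bits per pixel for a given share of structured blocks."""
    flag_bits = 1                       # one bitmap bit per block
    flat_bits = 8                       # single color index
    structured_bits = 24                # 8-bit mask + two color indices
    per_block = (flag_bits
                 + (1 - structured_fraction) * flat_bits
                 + structured_fraction * structured_bits)
    return per_block / 16               # 4 by 4 pixels per block


for share in (0.3, 0.4, 0.5):
    print(share, round(smp_bits_per_pixel(share), 3))
```

Under these assumptions the model gives roughly 0.86 to 1.06 bits per pixel for 30 to 50 percent structured blocks, which is close to the range quoted above.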

Decompression 

The most challenging part of the design of the 
SMP system, given the performance requirements, 
is the decompression step. Efficient rendering 
techniques of block-truncation coding are well 
known for certain classes of output devices. 3 
SMP improves on the implementations described 
in the literature by complementing the raw algorithm
with efficient, device-independent rendering
engines. 8,10,11 To maximize code efficiency, a separate
decompression routine is used for each display
situation, rather than using conditionals in a more 
generic routine. The current implementation can 
render to 1-, 8-, and 24-bit displays. 

Decompression of BTC involves filling 4 by 4 
blocks of pixels with two colors under a mask. 
Because the size and alignment of the blocks is 
known, a very fast, fully unrolled code sequence 
can be used. Changes of brightness and contrast of 
the image can be rapidly adapted to different view- 
ing conditions by manipulating the entries of the 
colormap of the SMP encoding. Most of the work 
lies in adaptation of the color content of the decompressed
data to the device characteristics of the
frame buffer. 
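
The core of the decoder, filling a 4 by 4 block with two colors under a mask, can be sketched as follows. This is a readable illustration rather than the fully unrolled code the text refers to: the frame is a list of rows, the colormap is a list of gray values indexed by the coded byte, and all names and the mask bit ordering are assumptions.

```python
def fill_block(frame, bx, by, mask, low, high, colormap):
    """Write one 4x4 block at block coordinates (bx, by) into frame."""
    x0, y0 = 4 * bx, 4 * by
    colors = (colormap[low], colormap[high])   # look up both colors once
    for i in range(16):                        # bit i selects the color of pixel i
        frame[y0 + i // 4][x0 + i % 4] = colors[(mask >> i) & 1]


def adjust_brightness(colormap, offset):
    """Return a colormap with every entry shifted and clamped to 0..255."""
    return [min(255, max(0, value + offset)) for value in colormap]


frame = [[0] * 8 for _ in range(4)]            # 8 by 4 pixels, two blocks wide
colormap = list(range(256))                    # identity gray ramp
fill_block(frame, 1, 0, 0xFF00, 10, 200, colormap)
brighter = adjust_brightness(colormap, 20)
```

Because brightness is changed by editing the 256 colormap entries rather than the pixels, the per-pixel decoding loop stays the same regardless of the viewing adjustment.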

For displays with full-color capabilities (24-bit 
true color), the process is straightforward. The 
main problem is performing the copy of the decompressed
video to the screen. Since 24-bit data is usually
allocated in 32-bit words, the amount of data to
copy is four times the 8-bit case. Typically, SMP 
spends 90 percent of the CPU time in the screen 
copy on 24-bit systems. 

The more common and interesting case is to 
decompress to an 8-bit color representation. Given 
that SMP is an 8-bit, color-indexed format, it would 
seem straightforward to download the SMP frame 
color table to the window system color table and 
fill the image with the pixel indices directly. This 
method is impractical for two reasons. First, most
window systems (including X11) do not allow
reservation of all 256 colors in the hardware color
tables. Typically, applications and window
managers use a few of the entries for system colors and
cursors. Quantizing down to a smaller number of
colors (such as 240) could overcome this drawback
to a certain degree; however, it would make the
SMP-coded material dependent on the device
characteristics of a particular window system.

The second and much more problematic aspect 
is that the SMP frames in a sequence usually have 
different color tables. Consequently, each frame 
requires a change of color table that causes a kalei- 
doscopic effect for the windows of other applica- 
tions on the screen. In fact, flashing cannot be 
eliminated within the SMP window itself. 

Neither X11 nor other popular window systems
such as Microsoft Windows allows reloading of the
color table and the content of an image at the same 
time. Therefore, regardless of whether the color 
table or image contents is modified first, a flashing 
color effect takes place in the SMP window. It may 
seem that the update would have to be done in a 
single screen refresh time as opposed to simultane- 
ously. This is true but irrelevant. Most window 
systems do not allow for such fine-grain synchro- 
nization; and for performance reasons, it was unre- 
alistic to expect to be able to update the image in a 
single, vertical blanking period. 

Alternative suggestions to avoid this problem
have been proposed in the literature. One
suggestion is to use a single color table for the entire
sequence of frames. 10, 11 This method is
computationally expensive and fails for long sequences and
editing operations. Another proposes quantization
to fewer than half of the available colors, or partial
updates of the colormap and use of plane masks.
This alternative is not particularly portable
between different window systems, and the use of
plane masks can have a disastrous impact on
performance for some frame-buffer implementations
such as the CX adapter in the DECstation product
line.

Neither of these methods addresses the issue of 
monochrome displays or the use of multiple simul- 
taneous SMP movies on a single display. (This effect 
can be witnessed in Sun Microsystems' recent addi- 
tion of CCC coding to their XIL library.) To keep 
device influence out of the compressed material 
and to enable the use of SMP on a wide range of 
devices and window systems, a generic decoupling 
step was added between the colors in the SMP 
frame and the device colors used for rendition on 
the screen. 
A well-known technique for matching color
images to devices with a limited color resolution is 
dithering. Dithering trades spatial resolution for an 
apparent increase in color and luminance resolu- 
tion of the display device. The decrease in spatial 
resolution is less of an issue for SMP images because 
of their inherently limited spatial resolution capa- 
bility. Thus the only challenge was the computa- 
tional cost of performing dithering in real time. 

Fortunately, we found a dithering algorithm that 
allowed both good quality and high speed. 12 It 
reduces quantization and mapping to a few table 
look-up operations, which have a trivial hardware 
implementation (random access memory) and a 
reasonable software implementation with a few 
adds, shifts, and loads. 

The general software implementation of the 
dithering algorithm takes 12 instructions in the 
MIPS instruction set to map a single pixel to its out- 
put representation. For SMP decoding, two differ- 
ent colors at most are in each 4 by 4 block. With this 
distribution, the cost of dithering is spread over the 
16 pixels in each block. 

Another optimization used heavily in the 8-bit 
decoder is to manipulate 4 pixels simultaneously 
with a single machine instruction. This technique 
increases performance for decompressing and 
dithering to 3.2 instructions per pixel in the MIPS
instruction set, including all loop overhead, decod- 
ing of the encoded data stream, and adjusting con- 
trast and brightness of the image (2.7 instructions 
per pixel for gray-scale). This efficiency is achieved 
by careful merging of the decoding, decompres- 
sion, and dithering phases into a single block of 
code and avoiding intermediate results written to 
memory. The cost of the 1-bit and 24-bit decoders is
the same or lower (0.2 and 2.9 instructions per
pixel, respectively).
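
The idea of handling 4 pixels with one machine operation can be illustrated with 32-bit words. The sketch below is an assumption-laden illustration rather than the SMP decoder: the table ROW_SELECT, the helper names, and the byte order are hypothetical, and the per-position dithering that the real decoder merges into this step is left out.

```python
# Byte-select masks for the two pair bits of one block row.
# Assumed layout: the left pixel pair occupies the two high-order bytes.
ROW_SELECT = [0x00000000, 0x0000FFFF, 0xFFFF0000, 0xFFFFFFFF]


def replicate(pixel):
    """Copy one 8-bit pixel value into all four bytes of a 32-bit word."""
    return pixel * 0x01010101


def row_word(pair_bits, fg, bg):
    """Build the four 8-bit pixels of one block row as a single 32-bit word."""
    select = ROW_SELECT[pair_bits]
    return (replicate(fg) & select) | (replicate(bg) & ~select & 0xFFFFFFFF)


print(hex(row_word(0b10, 0x07, 0x02)))
```

Because a block holds only two colors, the two replicated words can be computed once per block and reused for all four rows, which is how the per-pixel cost is amortized.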

Compression 

The SMP compressor takes an input image, a desired 
coded image size, and an output buffer as argu- 
ments. It operates in five phases: 

■ Input scaling (optional) 

■ Block truncation (luminance) 

■ Flat block selection 

■ Color quantization (color SMP only) 

■ Encoding and output writing 



Although the initial scaling is not strictly part of 
the SMP algorithm, it is necessary for different input 
sources. Fast scaling is offered as part of both the 
library and the command-line SMP compressors. 
Instead of simple subsampling, true averaging is 
used to ensure maximum input image quality. 

The block truncation phase makes two passes 
through each 4 by 4 block of the input. The first 
pass calculates the luminance of each individual 
pixel and sums them to find the average luminance 
of the entire block. The second pass partitions the 
pixel pairs into the foreground and background 
sets and calculates their respective luminance and 
chrominance averages. 
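
A minimal sketch of the two passes follows. Several details are assumptions made for illustration only: the Rec. 601 luminance weights, the use of plain RGB averages in place of separate luminance and chrominance averages, the mask bit order, and all function names.

```python
def luminance(rgb):
    """Approximate luminance; the Rec. 601 weights are an assumption."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def average(pixels):
    """Per-channel mean of a list of (r, g, b) pixels, or None if empty."""
    if not pixels:
        return None
    return tuple(sum(channel) / len(pixels) for channel in zip(*pixels))


def truncate_block(pairs):
    """pairs: 8 tuples of two (r, g, b) pixels each, in raster order.

    Returns (mask, foreground_average, background_average).
    """
    # Pass 1: luminance of every pixel pair and the block mean.
    pair_luma = [(luminance(a) + luminance(b)) / 2 for a, b in pairs]
    mean = sum(pair_luma) / len(pair_luma)

    # Pass 2: partition the pairs around the mean and average each set.
    mask, foreground, background = 0, [], []
    for luma, (a, b) in zip(pair_luma, pairs):
        bright = luma > mean
        mask = (mask << 1) | int(bright)
        (foreground if bright else background).extend([a, b])
    return mask, average(foreground), average(background)


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
mask, fg, bg = truncate_block([(WHITE, WHITE)] * 4 + [(BLACK, BLACK)] * 4)
print(bin(mask), fg, bg)
```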

The flat-block-selection phase uses the desired 
compression ratio to decide how many blocks can 
be kept as structured blocks and how many need to 
be converted to flat blocks. The luminance differ- 
ence of the blocks is calculated, and blocks in the 
low-contrast range are marked for transition to flat 
blocks. Because the total average was calculated for 
each block in the preceding phase, no additional 
calculations are needed for the conversion of 
blocks, and the mask is thrown away. Colors are 
entered into a search structure during this phase. 
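
The selection can be sketched as a ranking by contrast. This illustration takes the fraction of structured blocks as a direct parameter, which is a simplification: the compressor described here derives that budget from the desired coded image size. The function name and sample values are hypothetical.

```python
def select_structured(contrasts, structured_fraction):
    """Return one flag per block: True keeps the block structured,
    False marks it for conversion to a flat block."""
    keep = round(len(contrasts) * structured_fraction)
    by_contrast = sorted(range(len(contrasts)),
                         key=lambda i: contrasts[i], reverse=True)
    structured = set(by_contrast[:keep])
    return [i in structured for i in range(len(contrasts))]


# Luminance difference (foreground minus background) of ten blocks.
flags = select_structured([3, 40, 12, 0, 25, 7, 60, 1, 18, 5], 0.4)
print(flags)
```

A flat block then simply reuses the block average computed in the truncation phase, so no pixel data needs to be revisited.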

The color quantization phase uses a median cut 
algorithm, biased to ensure good coverage of the 
color contents of the image rather than minimize 
the overall quantization error. The bias method 
ensures that small, colored objects are not lost due 
to large, smoothly shaded areas getting the lion's 
share of the color allocations. These small objects 
often are the important features in motion 
sequences and have a high visibility despite their 
small size. 

The final encoding phase builds the color table 
and matches the foreground/background colors of 
the blocks to the best matches in the chosen color 
table. 

The gray-scale compression can be much faster 
because neither the quantization nor the matching 
step need be performed. Also, only one-third of the
uncompressed video data is usually read in, making 
gray-scale compression fast enough to enable real- 
time compression on faster workstations and video- 
conferencing type applications. 

This speed is partly due to the 8-bit restriction in 
the mask of each structured block. This restriction 
permits the algorithm to store all intermediate 
results of the block truncation step in registers on 
typical reduced instruction set computer (RISC) 
machines with 32 registers. The entire gray-scale 
compression algorithm can be done on a MIPS
R3000 with 8 machine instructions per input pixel 
on average, all overhead (except input scaling) 
included. 

Unfortunately, for color processing, SMP com- 
pression remains an off-line, non-real-time pro- 
cess, albeit a reasonably fast one at 220 instructions 
per pixel. A 25-MHz R3000 processor can process 
more than 40,000 frames in 24 hours (DECstation 
5000 Model 200, 320 by 240 at 15 frames per sec- 
ond, TX/PIP as frame grabber), equivalent to 45 min- 
utes of compressed video material per day. The 
more recent DEC 3000 AXP Model 500 workstation 
improves this number threefold, so special-purpose 
hardware for compression is unnecessary even for 
color SMP. 
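
The figures quoted above can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones given in the text.

```python
INSTRUCTIONS_PER_PIXEL = 220
WIDTH, HEIGHT = 320, 240
PLAYBACK_FPS = 15
FRAMES_PER_DAY = 40_000

# Work needed to compress one color frame.
instructions_per_frame = INSTRUCTIONS_PER_PIXEL * WIDTH * HEIGHT

# Playing time of one day's worth of compressed frames.
minutes_of_video = FRAMES_PER_DAY / PLAYBACK_FPS / 60

print(instructions_per_frame, round(minutes_of_video, 1))
```

At 15 frames per second, 40,000 frames play for a little over 44 minutes, which matches the roughly 45 minutes per day stated above.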

Portability 

A crucial part of the SMP design for portability is the 
placement of the original SMP codec on the client 
side of the X Window System. This allows porting 
and use of SMP on other systems, without being 
at the mercy of a particular system vendor for inte- 
gration of the codec into their X server or window 
system. 

This placement is enabled by the efficiency of the
SMP decompression engine, which allows many
spare cycles for performing the copy of the
decompressed, device-dependent video to the window
system.

Currently, SMP is offered as a product only on the 
DECstation family of workstations, but it has been 
ported to a variety of platforms, including 

■ DEC AXP workstations running the DEC OSF/1 
AXP operating system 

■ Alpha AXP systems running the OpenVMS oper- 
ating system 

■ DECpc AXP personal computers running the 
Windows NT AXP operating system 

■ VAX systems running the VMS operating system 

■ Sun SPARCstation

■ IBM RS/6000 system 

■ HP/PA Precision system 

■ SCO UNIX/Intel 

■ Microsoft Windows version 3.1 



Generally, porting the SMP system to another plat- 
form supporting the X Window System requires the 
selection of two parameters (host byte order and 
presence of the MIT-SHM extension) and then a
compilation. The same codec source is used on all the
above machines; no assembly language or machine- 
specific optimizations are used or needed. 

The port to Microsoft Windows shows that 
the same base technology can be used with other 
window systems, although parts specific to the win- 
dow system had to be rewritten. The codec code is 
essentially identical, but the extreme shortage of 
registers in the 80x86 architecture and the lack of 
reasonable handling of 32-bit pointers in the C
language under Windows warrant a rewrite in
assembly language on this platform. We do not expect
this to be an issue on Windows version 3.2, due to
be released later in 1993.

Conclusion 

Software motion pictures offers a cost-effective, 
totally portable way of bringing digital video to the 
desktop without requiring special investments for 
add-on hardware. Combined with audio facilities, 
SMP can be used to bring a complete video playback 
to most desktop systems. The algorithm and imple- 
mentation were designed to be used from CD-ROMs 
as well as network connections. SMP seamlessly 
integrates with the existing windowing system soft- 
ware. Because of its potentially universal availabil- 
ity, SMP can serve an important function as the 
lowest common denominator for digital video 
across multiple platforms. 
Digital Audio Compression

Compared to most digital data types, with the exception of digital video, the data 
rates associated with uncompressed digital audio are substantial. Digital audio 
compression enables more efficient storage and transmission of audio data. The 
many forms of audio compression techniques offer a range of encoder and decoder 
complexity, compressed audio quality and differing amounts of data compression. 
The μ-law transformation and ADPCM coder are simple approaches with low-complexity,
low-compression, and medium audio quality algorithms. The MPEG/audio
standard is a high-complexity, high-compression, and high audio quality
algorithm. These techniques apply to general audio signals and are not specifically 
tuned for speech signals. 



Digital audio compression allows the efficient stor- 
age and transmission of audio data. The various 
audio compression techniques offer different levels 
of complexity, compressed audio quality, and 
amount of data compression. 

This paper is a survey of techniques used to com- 
press digital audio signals. Its intent is to provide 
useful information for readers of all levels of experi- 
ence with digital audio processing. The paper 
begins with a summary of the basic audio digitiza- 
tion process. The next two sections present 
detailed descriptions of two relatively simple 
approaches to audio compression: μ-law and
adaptive differential pulse code modulation. In the
following section, the paper gives an overview of a
third, much more sophisticated, audio
compression algorithm from the Motion Picture Experts
Group. The topics covered in this section are quite
complex and are intended for the reader who is
familiar with digital signal processing. The paper
concludes with a discussion of software-only
real-time implementations.

Digital Audio Data 

The digital representation of audio data offers 
many advantages: high noise immunity, stability 
and reproducibility. Audio in digital form also 
allows the efficient implementation of many audio 
processing functions (e.g., mixing, filtering, and 
equalization) through the digital computer.

The conversion from the analog to the digital 
domain begins by sampling the audio input in regu- 
lar, discrete intervals of time and quantizing the 
sampled values into a discrete number of evenly 
spaced levels. The digital audio data consists of a 
sequence of binary values representing the number 
of quantizer levels for each audio sample. The 
method of representing each sample with an inde- 
pendent code word is called pulse code modulation 
(PCM). Figure 1 shows the digital audio process. 
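
The sampling and uniform quantization steps can be sketched in a few lines. This is an illustration only: the function name, the unsigned code range, and the 1-kHz test tone are assumptions, and a real converter would also band-limit the input before sampling.

```python
import math


def pcm_encode(signal, sample_rate, duration, bits):
    """Sample signal(t) at regular intervals and quantize each sample
    into one of 2**bits evenly spaced levels.

    signal is assumed to return values in the range -1.0 to 1.0.
    """
    levels = 2 ** bits
    codes = []
    for n in range(int(sample_rate * duration)):
        x = signal(n / sample_rate)
        level = int(math.floor((x + 1.0) / 2.0 * levels))
        codes.append(min(levels - 1, max(0, level)))  # keep code in range
    return codes


def tone(t):
    """A 1-kHz sine wave used as the analog input."""
    return math.sin(2 * math.pi * 1000 * t)


codes = pcm_encode(tone, sample_rate=8000, duration=0.001, bits=8)
print(codes)
```

Each entry of the result is an independent code word for one sample, which is the defining property of PCM.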
According to the Nyquist theory, a time-sampled
signal can faithfully represent signals up to half the 
sampling rate. 1 Typical sampling rates range from 
8 kilohertz (kHz) to 48 kHz. The 8-kHz rate covers 
a frequency range up to 4 kHz and so covers most of 
the frequencies produced by the human voice. The 
48-kHz rate covers a frequency range up to 24 kHz 
and more than adequately covers the entire audible 
frequency range, which for humans typically 
extends to only 20 kHz. In practice, the frequency 
range is somewhat less than half the sampling rate 
because of the practical system limitations. 

The number of quantizer levels is typically a 
power of 2 to make full use of a fixed number of 
bits per audio sample to represent the quantized 
values. With uniform quantizer step spacing, each 
additional bit has the potential of increasing the 
signal-to-noise ratio, or equivalently the dynamic 
range, of the quantized amplitude by roughly 
6 decibels (dB). The typical number of bits per sam- 
ple used for digital audio ranges from 8 to 16. The 
dynamic range capability of these representations 
thus ranges from 48 to 96 dB, respectively. To put 
these ranges into perspective, if 0 dB represents the 
weakest audible sound pressure level, then 25 dB 
is the minimum noise level in a typical recording 
studio, 35 dB is the noise level inside a quiet home, 
and 120 dB is the loudest level before discomfort 
begins. 2 In terms of audio perception, 1 dB is the 
minimum audible change in sound pressure level 
under the best conditions, and doubling the sound 
pressure level amounts to one perceptual step in 
loudness. 
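
The 6-dB-per-bit rule follows from expressing the ratio between the largest and smallest quantizer amplitudes in decibels. A short check, with an illustrative function name:

```python
import math


def dynamic_range_db(bits):
    """Dynamic range of a uniform quantizer with 2**bits levels."""
    return 20 * math.log10(2 ** bits)


for bits in (1, 8, 16):
    print(bits, round(dynamic_range_db(bits), 2))
```

Each bit contributes 20 log10(2), or about 6.02 dB, so 8 and 16 bits give roughly 48 dB and 96 dB.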

Compared to most digital data types (digital 
video excluded), the data rates associated with 
uncompressed digital audio are substantial. For 
example, the audio data on a compact disc (2 chan- 
nels of audio sampled at 44.1 kHz with 16 bits per 
sample) requires a data rate of about 1.4 megabits 
per second. There is a clear need for some form of 
compression to enable the more efficient storage 
and transmission of this data. 
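
The compact disc figure can be reproduced directly from the three parameters named in the text. The function name is illustrative.

```python
def pcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Data rate of uncompressed PCM audio in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


cd_rate = pcm_bit_rate(channels=2, sample_rate_hz=44_100, bits_per_sample=16)
print(cd_rate / 1e6, "megabits per second")
```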

The many forms of audio compression tech- 
niques differ in the trade-offs between encoder and 
decoder complexity, the compressed audio quality, 
and the amount of data compression. The
techniques presented in the following sections of this
paper cover the full range from the μ-law, a
low-complexity, low-compression, and medium audio
quality algorithm, to MPEG/audio, a
high-complexity, high-compression, and high audio quality
algorithm. These techniques apply to general audio
signals and are not specifically tuned for speech
signals. This paper does not cover audio compression
algorithms designed specifically for speech signals.
These algorithms are generally based on a
modeling of the vocal tract and do not work well for
nonspeech audio signals. 3, 4 The federal standards 1015
LPC (linear predictive coding) and 1016 CELP (code
excited linear prediction) fall into this category of
audio compression.

μ-law Audio Compression

The μ-law transformation is a basic audio
compression technique specified by the Comité Consultatif
International Télégraphique et Téléphonique
(CCITT) Recommendation G.711. 5 The
transformation is essentially logarithmic in nature and
allows the 8 bits per sample output codes to cover a
dynamic range equivalent to 14 bits of linearly
quantized values. This transformation offers a
compression ratio of (number of bits per source sample)/8
to 1. Unlike linear quantization, the logarithmic
step spacings represent low-amplitude audio
samples with greater accuracy than higher-amplitude
values. Thus the signal-to-noise ratio of the
transformed output is more uniform over the range of
amplitudes of the input signal. The μ-law
transformation is

\[
y = \begin{cases}
255 - \dfrac{127}{\ln(1+\mu)} \times \ln(1+\mu|x|) & \text{for } x \ge 0 \\[1ex]
127 - \dfrac{127}{\ln(1+\mu)} \times \ln(1+\mu|x|) & \text{for } x < 0
\end{cases}
\]

where μ = 255, and x is the value of the input
signal normalized to have a maximum value of 1. The
CCITT Recommendation G.711 also specifies a
similar A-law transformation. The μ-law transformation
is in common use in North America and Japan for 
the Integrated Services Digital Network (ISDN) 
8-kHz-sampled, voice-grade, digital telephony ser- 
vice, and the A-law transformation is used else- 
where for the ISDN telephony. 
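
The transformation can be evaluated directly from its definition. The sketch below is a floating-point illustration with a hypothetical function name; it rounds to the nearest code and is not the segmented, table-driven encoder that G.711 implementations typically use.

```python
import math

MU = 255


def mu_law_encode(x):
    """Map a sample x, normalized to the range -1.0 to 1.0, to an 8-bit code
    using 255 or 127 minus the scaled logarithm of the magnitude."""
    scaled = 127.0 / math.log(1 + MU) * math.log(1 + MU * abs(x))
    base = 255 if x >= 0 else 127
    return int(round(base - scaled))


for x in (0.0, 0.01, 0.1, 1.0, -0.01, -1.0):
    print(x, mu_law_encode(x))
```

Small inputs move the code quickly while large inputs move it slowly, which is the logarithmic step spacing described above.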

Adaptive Differential Pulse 
Code Modulation 

Figure 2 shows a simplified block diagram of 
an adaptive differential pulse code modulation 
(ADPCM) coder. 6 For the sake of clarity, the figure 
omits details such as bit-stream formatting, the pos- 
sible use of side information, and the adaptation 
blocks. The ADPCM coder takes advantage of the 
fact that neighboring audio samples are generally
similar to each other. Instead of representing each
audio sample independently as in PCM, an ADPCM
encoder computes the difference between each
audio sample and its predicted value and outputs
the PCM value of the differential. Note that
the ADPCM encoder (Figure 2a) uses most of the 
components of the ADPCM decoder (Figure 2b) to 
compute the predicted values. 

The quantizer output is generally only a (signed) 
representation of the number of quantizer levels. 
The requantizer reconstructs the value of the quan- 
tized sample by multiplying the number of quan- 
tizer levels by the quantizer step size and possibly 
adding an offset of half a step size. Depending on 
the quantizer implementation, this offset may be 
necessary to center the requantized value between 
the quantization thresholds. 

The ADPCM coder can adapt to the characteristics 
of the audio signal by changing the step size of 
either the quantizer or the predictor, or by chang- 
ing both. The method of computing the predicted 
value and the way the predictor and the quantizer 
adapt to the audio signal vary among different 
ADPCM coding systems. 

Some ADPCM systems require the encoder to
provide side information with the differential
PCM values. This side information can serve
two purposes. First, in some ADPCM schemes 
the decoder needs the additional information to 
determine either the predictor or the quantizer 
step size, or both. Second, the data can provide 
redundant contextual information to the decoder 
to enable recovery from errors in the bit stream 
or to allow random access entry into the coded bit 
stream. 

The following section describes the ADPCM 
algorithm proposed by the Interactive Multimedia 
Association (IMA). This algorithm offers a compres- 
sion factor of (number of bits per source sample)/ 
4 to 1. Other ADPCM audio compression schemes 
include the CCITT Recommendation G.721 (32 kilo- 
bits per second compressed data rate) and 
Recommendation G.723 (24 kilobits per second 
compressed data rate) standards and the compact 
disc interactive audio compression algorithm. 7, 8

The IMA ADPCM Algorithm The IMA is a consor- 
tium of computer hardware and software vendors 
cooperating to develop a de facto standard for com- 
puter multimedia data. The IMA's goal for its audio 
compression proposal was to select a public- 
domain audio compression algorithm able to pro- 
vide good compressed audio quality with good 
data compression performance. In addition, the 
algorithm had to be simple enough to enable 
software-only, real-time decompression of stereo, 
44.1-kHz-sampled, audio signals on a 20-megahertz
(MHz) 386-class computer. The selected ADPCM
algorithm not only meets these goals, but is also 
simple enough to enable software-only, real-time 
encoding on the same computer. 

The simplicity of the IMA ADPCM proposal lies in 
the crudity of its predictor. The predicted value of 
the audio sample is simply the decoded value of the 
immediately previous audio sample. Thus the pre- 
dictor block in Figure 2 is merely a time-delay 
element whose output is the input delayed by one 
audio sample interval. Since this predictor is not 
adaptive, side information is not necessary for the 
reconstruction of the predictor. 

Figure 3 shows a block diagram of the quantiza- 
tion process used by the IMA algorithm. The quan- 
tizer outputs four bits representing the signed 
magnitude of the number of quantizer levels for
each input sample. 

Adaptation to the audio signal takes place only in 
the quantizer block. The quantizer adapts the step 
size based on the current step size and the quan- 
tizer output of the immediately previous input. 
Figure 3 IMA ADPCM Quantization



This adaptation can be done as a sequence of two 
table lookups. The three bits representing the 
number of quantizer levels serve as an index into 
the first table lookup whose output is an index 
adjustment for the second table lookup. This adjust- 
ment is added to a stored index value, and the 
range-limited result is used as the index to the sec- 
ond table lookup. The summed index value is 
stored for use in the next iteration of the step-size 
adaptation. The output of the second table lookup 
is the new quantizer step size. Note that given a 
starting value for the index into the second table
lookup, the data used for adaptation is completely
deducible from the quantizer outputs; side informa- 
tion is not required for the quantizer adaptation. 
Figure 4 illustrates a block diagram of the step-size 
adaptation process, and Tables 1 and 2 provide the 
table lookup contents. 
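
The two-lookup adaptation can be sketched as follows. The index adjustments are those of the IMA proposal's first table. The second table is NOT reproduced exactly: STEP_TABLE below is a hypothetical stand-in with 89 entries that start at 7 and grow by a factor of about 1.1, chosen only to mimic the behavior the text describes.

```python
# First table lookup: 3-bit magnitude -> index adjustment.
INDEX_ADJUST = [-1, -1, -1, -1, 2, 4, 6, 8]

# Stand-in for the second table lookup (assumed geometric approximation).
STEP_TABLE = [round(7 * 1.1 ** i) for i in range(89)]


def adapt(index, magnitude):
    """Return the new stored index and the new quantizer step size."""
    index += INDEX_ADJUST[magnitude]
    index = max(0, min(len(STEP_TABLE) - 1, index))  # range limiting
    return index, STEP_TABLE[index]


index = 10
for magnitude in (7, 7, 0, 0, 0):
    index, step = adapt(index, magnitude)
    print(magnitude, index, step)
```

Because the only inputs are the stored index and the quantizer output, a decoder running the same function stays in step with the encoder without side information.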

Table 1 First Table Lookup for the IMA
ADPCM Quantizer Adaptation

Magnitude    Index Adjustment
000          -1
001          -1
010          -1
011          -1
100           2
101           4
110           6
111           8

IMA ADPCM: Error Recovery A fortunate side
effect of the design of this ADPCM scheme is
that decoder errors caused by isolated code word
errors or edits, splices, or random access of the
compressed bit stream generally do not have a
disastrous impact on decoder output. This is
usually not true for compression schemes that use
prediction. Since prediction relies on the correct
decoding of previous audio samples, errors in
the decoder tend to propagate. The next section
explains why the error propagation is generally
limited and not disastrous for the IMA algorithm.
The decoder reconstructs the audio sample, Xp[n],
by adding the previously decoded audio sample,
Xp[n-1], to the result of a signed magnitude
product of the code word, C[n], and the quantizer step
size plus an offset of one-half step size:

\[ X_p[n] = X_p[n-1] + \text{step\_size}[n] \times C'[n] \]

where C'[n] = one-half plus a suitable numeric
conversion of C[n].
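
A decoder loop that follows this equation literally might look like the sketch below. It is an illustration under stated assumptions: the sign is taken from the high bit of the 4-bit code word, the output is clipped to 16 bits, STEP_TABLE is a hypothetical geometric stand-in for the second lookup table, and the scaling of the step size used by deployed codecs may differ.

```python
INDEX_ADJUST = [-1, -1, -1, -1, 2, 4, 6, 8]
# Assumed stand-in for the step-size table: 89 entries growing by about 1.1.
STEP_TABLE = [round(7 * 1.1 ** i) for i in range(89)]


def decode_adpcm(codes, index=0, sample=0):
    """Decode 4-bit signed-magnitude code words into audio samples."""
    out = []
    for code in codes:
        step = STEP_TABLE[index]
        sign = -1 if code & 0x8 else 1
        magnitude = code & 0x7
        # Previous sample plus step size times (magnitude + one-half).
        sample += sign * int(step * (magnitude + 0.5))
        sample = max(-32768, min(32767, sample))  # clip to 16 bits
        out.append(sample)
        # Adapt the step size for the next sample.
        index = max(0, min(len(STEP_TABLE) - 1,
                           index + INDEX_ADJUST[magnitude]))
    return out


print(decode_adpcm([0x7, 0x7, 0x0, 0x8]))
```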

An analysis of the second step-size table lookup 
reveals that each successive entry is about 1.1 times 
the previous entry. As long as range limiting of the 
second table index does not take place, the value 
for step_size[n] is approximately the product of the
previous value, step_size[n-1], and a function of
the code word, F(C[n-1]):

\[ \text{step\_size}[n] = \text{step\_size}[n-1] \times F(C[n-1]) \]

Table 2 Second Table Lookup for the IMA ADPCM Quantizer Adaptation

The above two equations can be manipulated
to express the decoded audio sample, Xp[n], as a
function of the decoded sample value at time m,
the step size at time m + 1, and the set of code
words between time m and n:

\[ X_p[n] = X_p[m] + \text{step\_size}[m+1] \times \sum_{i=m+1}^{n} \left( \prod_{j=m+1}^{i-1} F(C[j]) \right) \times C'[i] \]

Note that the terms in the summation are only
a function of the code words from time m + 1
onward. An error in the code word, C[q], or a
random access entry into the bit stream at time q can
result in an error in the decoded output, Xp[q], and
the quantizer step size, step_size[q+1]. The above
equation shows that an error in Xp[m] amounts to
a constant offset to future values of Xp[n]. This
offset is inaudible unless the decoded output
exceeds its permissible range and is clipped.
Clipping results in a momentary audible distortion
but also serves to correct partially or fully the offset
term. Furthermore, digital high-pass filtering of the
decoder output can remove this constant offset
term. The above equation also shows that an error
in step_size[m+1] amounts to an unwanted gain or
attenuation of future values of the decoded output
Xp[n]. The shape of the output waveform is
unchanged unless the index to the second step-size
table lookup is range limited. Range limiting results
in a partial or full correction to the value of the step
size.

The nature of the step-size adaptation limits the
impact of an error in the step size. Note that an
error in step_size[m+1] caused by an error in a
single code word can be at most a change of (1.1)^9, or
7.45 dB in the value of the step size. Note also that
any sequence of 88 code words that all have
magnitude 3 or less (refer to Table 1) completely corrects
the step size to its minimum value. Even at the
lowest audio sampling rate typically used, 8 kHz, 88
samples correspond to 11 milliseconds of audio.
Thus random access entry or edit points exist
whenever 11 milliseconds of low-level signal occur
in the audio stream.
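
Both numbers in this paragraph can be verified with simple arithmetic. The variable names are illustrative.

```python
import math

# Largest step-size error from one wrong code word: a factor of 1.1**9.
max_error_db = 20 * math.log10(1.1 ** 9)

# Time needed for 88 low-magnitude code words at an 8-kHz sampling rate.
recovery_ms = 88 / 8000 * 1000

print(round(max_error_db, 2), "dB,", round(recovery_ms, 1), "ms")
```

The exponent 9 is the spread between the largest index adjustment (+8) and the smallest (-1), and 88 is the number of single-step decrements needed to walk from the top of an 89-entry table to the bottom.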

MPEG/Audio Compression 

The Motion Picture Experts Group (MPEG) audio 
compression algorithm is an International Organi- 
zation for Standardization (ISO) standard for high- 
fidelity audio compression. It is one part of a 
three-part compression standard. With the other 
two parts, video and systems, the composite 



standard addresses the compression of synchro- 
nized video and audio at a total bit rate of roughly 
1.5 megabits per second. 

Like μ-law and ADPCM, the MPEG/audio
compression is lossy; however, the MPEG algorithm can
achieve transparent, perceptually lossless com- 
pression. The MPEG/audio committee conducted 
extensive subjective listening tests during the 
development of the standard. The tests showed 
that even with a 6-to-1 compression ratio (stereo,
16-bit-per-sample audio sampled at 48 kHz com- 
pressed to 256 kilobits per second) and under opti- 
mal listening conditions, expert listeners were 
unable to distinguish between coded and original 
audio clips with statistical significance. Further- 
more, these clips were specially chosen because 
they are difficult to compress. Grewin and Ryden 
give the details of the setup, procedures, and 
results of these tests. 9 
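
The 6-to-1 figure follows from the source and coded bit rates given in the text. The variable names are illustrative.

```python
CHANNELS = 2
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 48_000
CODED_BITS_PER_SECOND = 256_000

source_bits_per_second = CHANNELS * BITS_PER_SAMPLE * SAMPLE_RATE_HZ
compression_ratio = source_bits_per_second / CODED_BITS_PER_SECOND

print(source_bits_per_second, compression_ratio)
```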

The high performance of this compression algo- 
rithm is due to the exploitation of auditory mask- 
ing. This masking is a perceptual weakness of the 
ear that occurs whenever the presence of a strong 
audio signal makes a spectral neighborhood of 
weaker audio signals imperceptible. This noise- 
masking phenomenon has been observed and cor- 
roborated through a variety of psychoacoustic 
experiments. 10 

Empirical results also show that the ear has a lim- 
ited frequency selectivity that varies in acuity from 
less than 100 Hz for the lowest audible frequencies 
to more than 4 kHz for the highest. Thus the audible 
spectrum can be partitioned into critical bands that 
reflect the resolving power of the ear as a function 
of frequency. Table 3 gives a listing of critical band- 
widths. 

Because of the ear's limited frequency resolving 
power, the threshold for noise masking at any given 
frequency is solely dependent on the signal activity 
within a critical band of that frequency. Figure 5 
illustrates this property. For audio compression, 
this property can be capitalized on by transforming
the audio signal into the frequency domain, then 
dividing the resulting spectrum into subbands that 
approximate critical bands, and finally quantizing 
each subband according to the audibility of quanti- 
zation noise within that band. For optimal compres- 
sion, each band should be quantized with no more 
levels than necessary to make the quantization 
noise inaudible. The following sections present 
a more detailed description of the MPEG/audio 
algorithm. 
Table 3 Approximate Critical Band
Boundaries

Band     Frequency    Band     Frequency
Number   (Hz)*        Number   (Hz)*
 0           50       14        1,970
 1           95       15        2,340
 2                    16        2,720
 3                    17        3,280
 4                    18        3,840
 5                    19
 6          560       20        5,440
 7          660       21        6,375
 8                    22        7,690
 9          940       23        9,375
10        1,125       24       11,625
11        1,265       25       15,375
12        1,500       26       20,250
13        1,735

* Frequencies are at the upper end of the band.





MPEG/Audio Encoding and Decoding 
Figure 6 shows block diagrams of the MPEG/audio
encoder and decoder. 11, 12 In this high-level
representation, encoding closely parallels the pro- 
cess described above. The input audio stream 
passes through a filter bank that divides the input 
into multiple subbands. The input audio stream 
simultaneously passes through a psychoacoustic 
model that determines the signal-to-mask ratio of 
each subband. The bit or noise allocation block 
uses the signal-to-mask ratios to decide how to 
apportion the total number of code bits available 
for the quantization of the subband signals to mini- 
mize the audibility of the quantization noise. 
Finally, the last block takes the representation of
the quantized audio samples and formats the data 
into a decodable bit stream. The decoder simply 
reverses the formatting, then reconstructs the 
quantized subband values, and finally transforms 
the set of subband values into a time-domain audio 
signal. As specified by the MPEG requirements, 
ancillary data not necessarily related to the audio 
stream can be fitted within the coded bit stream. 

The MPEG/audio standard has three distinct lay- 
ers for compression. Layer I forms the most basic 
algorithm, and Layers II and III are enhancements
that use some elements found in Layer I. Each suc- 
cessive layer improves the compression perfor- 
mance but at the cost of greater encoder and 
decoder complexity. 

Layer I The Layer I algorithm uses the basic filter 
bank found in all layers. This filter bank divides the 
audio signal into 32 constant-width frequency 
bands. The filters are relatively simple and provide 
good time resolution with reasonable frequency 
resolution relative to the perceptual properties of 
the human ear. The design is a compromise with 
three notable concessions. First, the 32 constant-width
bands do not accurately reflect the ear's critical
bands. Figure 7 illustrates this discrepancy. The
bandwidth is too wide for the lower frequencies so 
the number of quantizer bits cannot be specifically 
tuned for the noise sensitivity within each critical 
band. Instead, the included critical band with the 
greatest noise sensitivity dictates the number of 
quantization bits required for the entire filter band. 
Second, the filter bank and its inverse are not loss- 
less transformations. Even without quantization, 
the inverse transformation would not perfectly 
recover the original input signal. Fortunately, the 
error introduced by the filter bank is small and 
inaudible. Finally, adjacent filter bands have a signif- 
icant frequency overlap. A signal at a single fre- 
quency can affect two adjacent filter bank outputs. 

The filter bank provides 32 frequency samples, 
one sample per band, for every 32 input audio sam- 
ples. The Layer I algorithm groups together 12 sam- 
ples from each of the 32 bands. Each group of 12 
samples receives a bit allocation and, if the bit allo- 
cation is not zero, a scale factor. Coding for stereo 
redundancy compression is slightly different and is 
discussed later in this paper. The bit allocation 
determines the number of bits used to represent 
each sample. The scale factor is a multiplier that 
sizes the samples to maximize the resolution of 
the quantizer. The Layer I encoder formats the 
32 groups of 12 samples (i.e., 384 samples) into a
frame. Besides the audio data, each frame contains 
a header, an optional cyclic redundancy code (CRC) 
check word, and possibly ancillary data. 
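
The role of the scale factor and the bit allocation can be illustrated with a minimal sketch. It assumes a plain uniform quantizer and takes the scale factor to be the largest sample magnitude in the group; the standard itself selects scale factors from a table and defines its own quantizer, so the function names and the arithmetic below are illustrative assumptions only.

```python
def quantize_group(samples, bits):
    """Return (scale_factor, codes) for one group of 12 subband samples."""
    if len(samples) != 12:
        raise ValueError("a Layer I group holds exactly 12 samples")
    if bits == 0:
        return None, [0] * 12          # zero allocation: no scale factor is coded
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits per sample")
    # Assumption: the scale factor is simply the peak magnitude of the group.
    scale = max(abs(s) for s in samples) or 1.0
    half = ((1 << bits) - 1) // 2      # largest code magnitude
    codes = [round(s / scale * half) for s in samples]
    return scale, codes


def dequantize_group(scale, codes, bits):
    """Rebuild approximate sample values from the codes and the scale factor."""
    if bits == 0:
        return [0.0] * len(codes)
    half = ((1 << bits) - 1) // 2
    return [c / half * scale for c in codes]


SAMPLES_PER_FRAME = 32 * 12  # 32 subbands x 12 samples = 384
```

Because the samples are normalized by the scale factor before quantization, the quantizer's full range is used regardless of how loud or quiet the group is.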

Layer II The Layer II algorithm is a simple
enhancement of Layer I. It improves compression 
performance by coding data in larger groups. The 
Layer II encoder forms frames of 3 by 12 by 32 = 
1,152 samples per audio channel. Whereas Layer I 
codes data in single groups of 12 samples for each 



subband, Layer II codes data in 3 groups of 12 sam- 
ples for each subband. Again discounting stereo 
redundancy coding, there is one bit allocation and 
up to three scale factors for each trio of 12 samples. 
The encoder encodes with a unique scale factor for 
each group of 12 samples only if necessary to avoid 
audible distortion. The encoder shares scale factor 
values between two or all three groups in two 
other cases: (1) when the values of the scale factors 
are sufficiently close and (2) when the encoder 
anticipates that temporal noise masking by the ear 
will hide the consequent distortion. The Layer II
algorithm also improves performance over Layer I 
by representing the bit allocation, the scale factor 
values, and the quantized samples with a more effi- 
cient code. 

Layer III The Layer III algorithm is a much more
refined approach. 13, 14 Although based on the same
filter bank found in Layers I and II, Layer III compen- 
sates for some filter bank deficiencies by process- 
ing the filter outputs with a modified discrete 
cosine transform (MDCT). Figure 8 shows a block 
diagram of the process. 

The MDCTs further subdivide the filter bank out- 
puts in frequency to provide better spectral resolu- 
tion. Because of the inevitable trade-off between 
time and frequency resolution, Layer III specifies 
two different MDCT block lengths: a long block of 36 
samples or a short block of 12. The short block length 
improves the time resolution to cope with transients.
Note that the short block length is one-third
that of a long block; when used, three short blocks 
replace a single long block. The switch between 
long and short blocks is not instantaneous. A long 
block with a specialized long-to-short or short-to- 
long data window provides the transition mecha- 
nism from a long to a short block. Layer III has three 
blocking modes: two modes where the outputs of 
the 32 filter banks can all pass through MDCTs with 
the same block length and a mixed block mode 
where the 2 lower-frequency bands use long blocks 
and the 30 upper bands use short blocks. 

Other major enhancements over the Layer I and 
Layer II algorithms include: 



■ Alias reduction - Layer III specifies a method of 
processing the MDCT values to remove some 
redundancy caused by the overlapping bands of 
the Layer I and Layer II filter bank. 

■ Nonuniform quantization - The Layer III quan- 
tizer raises its input to the 3/4 power before 
quantization to provide a more consistent signal- 
to-noise ratio over the range of quantizer values. 
The requantizer in the MPEG/audio decoder 
relinearizes the values by raising its output to 
the 4/3 power. 

■ Entropy coding of data values - Layer III uses 
Huffman codes to encode the quantized samples 
for better data compression. 15 

■ Use of a bit reservoir - The design of the Layer III 
bit stream better fits the variable length nature of 
the compressed data. As with Layer II, Layer III 
processes the audio data in frames of 1,152 sam- 
ples. Unlike Layer II, the coded data representing 
these samples does not necessarily fit into a 
fixed-length frame in the code bit stream. The 
encoder can donate bits to or borrow bits from 
the reservoir when appropriate.

■ Noise allocation instead of bit allocation - The 
bit allocation process used by Layers I and II only
approximates the amount of noise caused by 
quantization to a given number of bits. The Layer 
III encoder uses a noise allocation iteration 
loop. In this loop, the quantizers are varied in an 
orderly way, and the resulting quantization noise 
is actually calculated and specifically allocated 
to each subband. 
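
The nonuniform quantization listed above can be sketched in a few lines. The sketch assumes a single step size per value and simple rounding; the function names and the `step` parameter are illustrative assumptions, and the actual Layer III quantizer includes further details not shown here.

```python
def quantize(x, step):
    """Compress the magnitude with a 3/4 power law, then round to an integer."""
    q = int(round((abs(x) / step) ** 0.75))
    return -q if x < 0 else q


def requantize(q, step):
    """Relinearize in the decoder by raising the magnitude to the 4/3 power."""
    mag = abs(q) ** (4.0 / 3.0) * step
    return -mag if q < 0 else mag
```

The spacing between reconstruction levels grows with the magnitude of the value, so large values are quantized more coarsely in absolute terms while the relative error stays more nearly constant.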

The Psychoacoustic Model
The psychoacoustic model is the key component of 
the MPEG encoder that enables its high performance. 16-19
The job of the psychoacoustic model
is to analyze the input audio signal and determine 
where in the spectrum quantization noise will be 
masked and to what extent. The encoder uses this 
information to decide how best to represent the 
input audio signal with its limited number of code 
bits. The MPEG/audio standard provides two exam- 
ple implementations of the psychoacoustic model. 
Below is a general outline of the basic steps 
involved in the psychoacoustic calculations for 
either model. 

■ Time align audio data - The psychoacoustic 
model must account for both the delay of the 
audio data through the filter bank and a data 
offset so that the relevant data is centered within 
its analysis window. For example, when using 
psychoacoustic model two for Layer I, the delay 
through the filter bank is 256 samples, and the 
offset required to center the 384 samples of a 
Layer I frame in the 512-point psychoacoustic 
analysis window is (512 - 384)/2 = 64 points.
The net offset is 320 points to time align the 
psychoacoustic model data with the filter bank 
outputs. 

■ Convert audio to spectral domain - The psy- 
choacoustic model uses a time-to-frequency 
mapping such as a 512- or 1,024-point Fourier 
transform. A standard Hann weighting, applied 
to audio data before Fourier transformation, 
conditions the data to reduce the edge effects of 
the transform window. The model uses this sep- 
arate and independent mapping instead of the 
filter bank outputs because it needs finer fre- 
quency resolution to calculate the masking 
thresholds. 

■ Partition spectral values into critical bands - To 
simplify the psychoacoustic calculations, the 
model groups the frequency values into percep- 
tual quanta. 

■ Incorporate threshold in quiet - The model 
includes an empirically determined absolute 
masking threshold. This threshold is the lower 
bound for noise masking and is determined in 
the absence of masking signals. 

■ Separate into tonal and nontonal components - 
The model must identify and separate the tonal 



and noiselike components of the audio signal 
because the noise-masking characteristics of the 
two types of signal are different. 

■ Apply spreading function - The model deter- 
mines the noise-masking thresholds by applying 
an empirically determined masking or spreading
function to the signal components. 

■ Find the minimum masking threshold for each 
subband - The psychoacoustic model calculates 
the masking thresholds with a higher-frequency 
resolution than provided by the filter banks. 
Where the filter band is wide relative to the criti- 
cal band (at the lower end of the spectrum), the 
model selects the minimum of the masking 
thresholds covered by the filter band. Where the 
filter band is narrow relative to the critical band, 
the model uses the average of the masking 
thresholds covered by the filter band. 

■ Calculate signal-to-mask ratio - The psycho- 
acoustic model takes the minimum masking 
threshold and computes the signal-to-mask 
ratio; it then passes this value to the bit (or 
noise) allocation section of the encoder. 
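
Three of the steps outlined above reduce to short calculations: the time alignment offset, the Hann weighting, and the choice between the minimum and the average masking threshold for a filter band. The following sketch uses hypothetical function names, and the periodic form of the Hann window is an assumption, since the text does not give the exact weighting formula.

```python
import math


def analysis_offset(filter_delay, frame_len, window_len):
    """Filter bank delay plus the offset that centers the frame in the window."""
    return filter_delay + (window_len - frame_len) // 2


def hann(n):
    """Hann weighting of length n (periodic form, an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def subband_threshold(thresholds, band_is_wide):
    """Pick one masking threshold for a filter band.

    thresholds: the finer-resolution masking thresholds covered by the band.
    band_is_wide: True where the filter band is wide relative to the
    critical band (the lower end of the spectrum).
    """
    if band_is_wide:
        return min(thresholds)
    return sum(thresholds) / len(thresholds)
```

For the Layer I example in the text, `analysis_offset(256, 384, 512)` gives the net offset of 320 points.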

Stereo Redundancy Coding 
The MPEG/audio compression algorithm supports 
two types of stereo redundancy coding: intensity 
stereo coding and middle/side (MS) stereo coding. 
Both forms of redundancy coding exploit another 
perceptual weakness of the ear. Psychoacoustic 
results show that, within the critical bands cover- 
ing frequencies above approximately 2 kHz, the 
ear bases its perception of stereo imaging more 
on the temporal envelope of the audio signal than 
its temporal fine structure. All layers support intensity
stereo coding. Layer III also supports MS stereo
coding. 

In intensity stereo mode, the encoder codes 
some upper-frequency filter bank outputs with a 
single summed signal rather than send independent 
codes for left and right channels for each of the 32 
filter bank outputs. The intensity stereo decoder 
reconstructs the left and right channels based only 
on independent left- and right-channel scale fac- 
tors. With intensity stereo coding, the spectral 
shape of the left and right channels is the same 
within each intensity-coded filter bank signal, but 
the magnitude is different. 

The MS stereo mode encodes the signals for left 
and right channels in certain frequency ranges as 
middle (sum of left and right) and side (difference 
of left and right) channels. In this mode, the
encoder uses specially tuned techniques to further
compress the side-channel signal.
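
The middle/side transformation can be sketched as follows, taking the definitions literally as the plain sum and difference of the two channels. The function names are hypothetical, and any normalization that an actual encoder applies to the two signals is omitted here.

```python
def ms_encode(left, right):
    """Form middle (sum) and side (difference) signals from left and right."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left and right from the middle and side signals."""
    left = [(m + s) / 2 for m, s in zip(mid, side)]
    right = [(m - s) / 2 for m, s in zip(mid, side)]
    return left, right
```

When the left and right channels are similar, the side signal is small, which is what makes it a good candidate for further compression.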

Real-time Software Implementations 

The software-only implementations of the μ-law
and ADPCM algorithms can easily run in real time. A
single table lookup can do μ-law compression or
decompression. A software-only implementation
of the IMA ADPCM algorithm can process stereo,
44.1-kHz-sampled audio in real time on a 20-MHz
386-class computer. The challenge lies in developing
a real-time software implementation of the
MPEG/audio algorithm. The MPEG standards document
does not offer many clues in this respect.
The calculations required by the encoding and
decoding processes can be performed much more
efficiently than by the procedures outlined in
the standard. As an example, the following section
details how the number of multiplies and additions
used in a certain calculation can be reduced
by a factor of 12.

Figure 9 shows a flow chart for the analysis sub- 
band filter used by the MPEG/audio encoder. Most 
of the computational load is due to the second-
from-last block. This block contains the following 
matrix multiply: 



$$S(i) = \sum_{k=0}^{63} Y(k) \times \cos\left[\frac{(2i+1)(k-16)\pi}{64}\right] \quad \text{for } i = 0 \ldots 31.$$

Figure 9 Flow Diagram of the MPEG/Audio Encoder Filter Bank

1. Shift in 32 new samples into the 512-point FIFO buffer, X(i).
2. Window samples: for i = 0 to 511, Z(i) = C(i) x X(i).
3. Partial calculation: for i = 0 to 63, Y(i) = sum over j = 0 to 7 of Z(i + 64j).
4. Calculate 32 samples by matrixing: S(i) = sum over k = 0 to 63 of Y(k) x M(i,k).
5. Output 32 subband samples.



Using the above equation, each of the 32 values of
S(i) requires 63 adds and 64 multiplies. To optimize
this calculation, note that the M(i,k) coefficients
are similar to the coefficients used by a 32-point,
un-normalized inverse discrete cosine transform
(DCT) given by

$$f(i) = \sum_{k=0}^{31} F(k) \times \cos\left[\frac{(2i+1)\,k\,\pi}{64}\right] \quad \text{for } i = 0 \ldots 31.$$

Indeed, S(i) is identical to f(i) if F(k) is computed
as follows:

$$F(k) = \begin{cases} Y(16) & \text{for } k = 0; \\ Y(k+16) + Y(16-k) & \text{for } k = 1 \ldots 16; \\ Y(k+16) - Y(80-k) & \text{for } k = 17 \ldots 31. \end{cases}$$



Thus with the almost negligible overhead of com- 
puting the F(k) values, a twofold reduction in mul- 
tiplies and additions comes from halving the range 
that k varies. Another reduction in multiplies and 
additions of more than sixfold comes from using 
one of many possible fast algorithms for the compu- 
tation of the inverse DCT. 2(1 - 1 11 There is a similar 
optimization applicable to the 64 by 32 matrix mul- 
tiply found within the decoder's subband filter 
bank. 
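
The first of these reductions can be checked numerically. The sketch below evaluates the 64-term matrixing sum directly and then through the folded 32-term form; the function names are hypothetical, and a plain loop stands in for the fast inverse DCT, so only the twofold saving from folding is demonstrated.

```python
import math


def matrix_direct(y):
    """S(i) as a 64-term sum over k for each of the 32 outputs."""
    return [
        sum(y[k] * math.cos((2 * i + 1) * (k - 16) * math.pi / 64)
            for k in range(64))
        for i in range(32)
    ]


def fold(y):
    """Fold the 64 Y(k) values into the 32 F(k) inputs of the inverse DCT."""
    f = [0.0] * 32
    f[0] = y[16]
    for k in range(1, 17):
        f[k] = y[k + 16] + y[16 - k]
    for k in range(17, 32):
        f[k] = y[k + 16] - y[80 - k]
    return f


def matrix_folded(y):
    """S(i) as a 32-term, un-normalized inverse DCT of the folded values."""
    f = fold(y)
    return [
        sum(f[k] * math.cos((2 * i + 1) * k * math.pi / 64) for k in range(32))
        for i in range(32)
    ]
```

Note that Y(48) never appears in the folded form: its cosine coefficient is zero for every i, so it contributes nothing to the direct sum either.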

Many other optimizations are possible for both 
MPEG/audio encoder and decoder. Such optimiza- 
tions enable a software-only version of the MPEG/ 
audio Layer I or Layer II decoder (written in the C 
programming language) to obtain real-time per- 
formance for the decoding of high-fidelity mono- 
phonic audio data on a DECstation 5000 Model 200. 
This workstation uses a 25-MHz R3000 MIPS CPU 
and has 128 kilobytes of external instruction 
and data cache. With this optimized software, the 
MPEG/audio Layer II algorithm requires an average 
of 13.7 seconds of CPU time (12.8 seconds of user
time and 0.9 seconds of system time) to decode 7.47
seconds of a stereo audio signal sampled at 48 kHz
with 16 bits per sample. 

Although real-time MPEG/audio decoding of 
stereo audio is not possible on the DECstation 5000, 
such decoding is possible on Digital's workstations 
equipped with the 150-MHz DECchip 21064 CPU 
(Alpha AXP architecture) and 512 kilobytes of exter- 
nal instruction and data cache. Indeed, when this 
same code (i.e., without CPU-specific optimization) 
is compiled and run on a DEC 3000 AXP Model 500
workstation, the MPEG/audio Layer II algorithm
requires an average of 4.2 seconds (3.9 seconds of
user time and 0.3 seconds of system time) to 
decode the same 7.47-second audio sequence. 

Summary 

Techniques to compress general digital audio signals
include μ-law and adaptive differential pulse
code modulation. These simple approaches apply 
low-complexity, low-compression, and medium 
audio quality algorithms to audio signals. A third 
technique, the MPEG/audio compression algorithm, 
is an ISO standard for high-fidelity audio compres- 
sion. The MPEG/audio standard has three layers of 
successive complexity for improved compression 
performance. 

The Megadoc Image Document
Management System 

Megadoc image document management solutions are the result of a systems 
engineering effort that combined several disciplines, ranging from optical disk 
hardware to an image application framework. Although each of the component 
technologies may be fairly mature, combining them into easy-to-customize solu- 
tions presented a significant systems engineering challenge. The resulting application
framework allows the configuration of customized solutions with low systems
integration cost and short time to deployment. 



Electronic Document Management 

In most organizations, paper is the main medium 
for information sharing. Paper is not only a commu- 
nication medium but in many cases also the carrier 
of an organization's vital information assets. Whereas 
the recording of information in document format is 
done largely with help of electronic equipment, 
sharing and distribution of that information is in 
many cases still done on paper. Large-scale, paper- 
based operations have limited options for tracking 
the progress of work. 

The computer industry thus has two opportunities: 

1. Capture paper documents in electronic image 
format (if using paper is a requirement) 

2. Provide better tools for sharing and distribution 
among work groups (if the use of paper can be 
avoided) 

Organizations that use electronic imaging, as 
compared to handling paper, can better track work 
in progress. Productivity increases (no time is 
wasted in searching) and the quality of service 
improves (response times are shorter and no infor- 
mation is lost) when vital information is repre- 
sented and tracked electronically. 

Imaging is not a new technology (see Table 1). 
Moreover, this paper does not document new base 
technology. Instead, we describe the key components
of an image document management system in
the context of a systems engineering effort. This 
effort resulted in a product set that allows the con- 
figuration of customized solutions. 

Those who first adopted the use of image tech- 
nology have had to go through a long learning 



curve — a computer with a scanner and an optical 
disk does not fully address the issues of a large- 
scale, paper-based operation. Early adopters of 
electronic imaging experienced a challenge in 
defining the right electronic document indexing 
scheme for their applications. Even though the 
technology is now mature, the introduction of a 
document imaging system frequently leads to some 
form of business process reengineering to exploit 
the new options of electronic document manage- 
ment. The Megadoc image document management 
system allows the configuration of customer- 
specific solutions through its building-block archi- 
tecture and its built-in customization options. 

The Megadoc system presented in this paper is 
based on approximately 10 years of experience 
with base technology, customer projects, and
everything in between. In those years, Megadoc 
image document management has matured from 
the technology delight of optical recording to an 
application framework for image document man- 
agement. This framework consists of hardware and 
software components arranged in various architec- 
tural layers: the base system, the optical file server, 
the storage manager, and the image application 
framework. 

The base system consists of PC-based work- 
stations, running the Microsoft Windows operating 
system, connected to servers for storage manage- 
ment and to database services for document index- 
ing. Specific peripherals include image scanners, 
image printers, optional full-screen displays, and 
optional write once, read many (WORM) disks. 

Table 1 History of Image Document Management

1975  Philips Research combines a 12-inch (30.48-centimeter) videodisk for analog storage of facsimile
      documents and high-resolution video monitors with a minicomputer for indexing in an experimental
      image management system.

1979  Philips' image management system switches to digital technology through the availability of
      WORM disks and random-access memory (RAM) chips (for refreshing a full-page video monitor).

1983  At the Hannover Fair (Hannover, Germany), Philips shows Megadoc, an image document
      management system with WORM disks containing compressed document images. Dedicated
      image document management solutions are introduced.

1988  Image document management transitions from dedicated image display technology as part of a
      proprietary computer architecture to an open systems platform with PC-based image workstations.

1993  The image becomes just another document format that is used next to text-coded electronic
      documents.

The optical file server abstracts from the differences
between optical WORM disks and provides
the many hundreds of gigabytes (GB) of storage
required in large-scale image document management
systems.

The storage manager provides storage and 
retrieval functions for the contents of documents. 
Document contents are stored in "containers," i.e., 
large, one-dimensional storage areas that can span 
multiple optical disk volumes. 

The Megadoc image application framework con- 
tains three sublayers: 

1. Image-related software libraries for scanning, 
viewing, and printing 

2. Application templates 

3. A standard folder management application that 
provides, with some tailoring by the end-user 
organization, an "out-of-the-box" image docu- 
ment management solution 

The optical file server and the storage manager 
store images in any type of document format. 
However, to meet customer requirements with 
respect to longevity of the documents, images 
should be stored in compressed format according 
to the Comité Consultatif International Télégraphique
et Téléphonique (CCITT) Group 4
standard. 

In addition to image document management 
solutions, Megadoc components are used to "image 
enable" existing data processing applications. In
many cases, a data processing application uses 
some means of identification for an application 
object (e.g., an order or an invoice). This identifica- 
tion relates to a paper document. Megadoc reuses 
the application's identification as the key to the 
image version of that document. Application pro- 
gramming interfaces (APIs) for terminal emulation 



packages that are running the original application 
in a window on the Megadoc image PC work- 
stations allow integration with the unchanged 
application. 

The following sections describe the optical file 
server, the storage manager, and the image applica- 
tion framework. 

Megadoc Optical File Server

The Megadoc optical file server (OFS) software pro- 
vides a UNIX file system interface for WORM disks. 
The OFS automatically loads and unloads these 
WORM volumes by jukebox robotics in a completely 
transparent way. Thus, from an API perspective, OFS 
implements a UNIX file system with a large on-line 
file system storage capacity. Currently, up to 800 GB 
can be reached with a single jukebox. 

We implemented the OFS in three layers, as 
shown in Figure 1:

1. The optical disk filer (ODF) layer, which enables 
storing data on write-once devices and provides
a UNIX file system interface.

2. The volume manager (VM), which loads and 
unloads volumes to and from drives in the juke- 
boxes and communicates with the system operator
for handling off-line volumes.

3. The device layer, which provides device-level 
access to the WORM drives and to the jukebox 
hardware. This layer is not discussed further in 
this paper. 

Optical Disk Filer 

When we started to design the ODF, the chief
prerequisite was that it should adhere to the UNIX
file system interface for applications. The obvious
benefit was that the designers would not have to
write their own utilities to, for example, copy data,
create new files, and make new directories. All
UNIX utilities would work as well on WORM devices
as on any other file system.

Figure 1 The Three Software Layers of the Optical File Server

Current UNIX implementations provide two ker- 
nel interfaces for integrating a new file system type 
into the kernel: the file system switch (FSS), in UNIX 
versions based on the System V Release 3; and the 
virtual file system (VFS), in UNIX implementations 
like the System V Release 4, SunOS, and OSF/1 oper- 
ating systems. We introduced the optical disk filer 
in the FSS and later ported it to the VFS. 

The key challenge for the design of a file system 
for write-once devices is to allow updates without 
causing an "avalanche" of updates. Note that any 
update to a sector on a WORM device forces a 
rewrite of the full sector at another location. If 
pointers to an updated sector exist on the WORM 
device, sectors that contain those pointers have to 
be rewritten, also. For example, if a file system 
implementation is chosen where the list of data 
blocks for a file, or just the sector location of such a 
list, is part of the file's directory information, any 
update to that file would cause a rewrite of the 
directory sector and the sectors for the parent 
directories, all the way up to the root directory. 

A second issue to be addressed for removable 
optical disks is performance. Access time for on-line 
disks is at least eight times slower than for current 



magnetic disks. (The average seek time for a WORM 
device is 100 milliseconds; rotational delay is about 
35 milliseconds.) Fetching a disk from a jukebox 
storage slot, loading it, and waiting for spin-up 
takes between 8 and 15 seconds, depending on the 
type of jukebox. 

Caching solves both issues. We decided that the 
usual in-memory cache would not be sufficient for 
the huge amounts of WORM data, and therefore, we 
use partitions of magnetic disks for caching. 

ODF WORM Layout To avoid duplicating previ- 
ous efforts, we used classical UNIX file systems as 
a guideline for the definition of ODF's WORM layout. 
However, we had to add some indirect pointer 
mechanisms to avoid update avalanches. Each file 
system is mapped onto a single WORM partition. 
These partitions are written sequentially, reducing 
the free block administration to maintaining a cur- 
rent write point. 

The ODF reuses many notions from UNIX file sys- 
tems, such as i-nodes, superblock, and the func- 
tional contents of directory entries. 1 Applying 
these UNIX notions to the optical file system 
resulted in the following ODF characteristics: 

■ The superblock contains all global data for a file 
system. 

■ Each i-node contains the block list and all the 
attributes of a file except the file's name. 

■ An i-node number identifies each i-node. 

■ A directory is a special type of file. 

■ Entries in a directory map names to i-node 
numbers. 

A new notion in the ODF, as compared to UNIX 
file systems, is the administration file (admin file). 
One such file exists for each file system. The file is 
sequential, and its contents are similar to the first 
disk blocks in classical UNIX file systems: the first 
extent contains the superblock, and all other 
extents form a constantly growing array of i-nodes; 
the i-node's number is the index of the i-node in the 
file's i-node array. An important difference between
UNIX file systems and the ODF is that the 2-kilobyte 
(kB), fixed-size extents of the ODF admin file are 
scattered over the WORM device, instead of being 
stored as a sequential array of disk blocks, as in 
UNIX systems. As a result, any update to an i-node, 
as a consequence of a file update, causes the invali- 
dation of at most one admin file extent. Since 
the logical index in the admin file of this i-node, i.e., 
the i-node number, does not change, the parent
directories do not have to be updated. 

However, this scheme needs an additional indi- 
rect pointer mechanism: a list of block numbers 
representing the location of the admin file extents. 
The ODF stores this list in the admin file's i-node 
(aino). The aino is a sequential file that contains 
slightly more than block numbers and is a sequence 
of contiguous blocks on the WORM disk that con- 
tain the same information. Hence, an update to an 
admin file extent always invalidates the entire aino 
on the WORM device, which makes the aino a more 
desirable candidate for caching than the admin file 
extents. 

The following example, shown in Figure 2, illus- 
trates the steps involved in reading logical block N 
from the file with i-node number I:

1. Read the aino to obtain the block number of I's
admin file extent.

2. Read the admin file extent to get the i-node for
file I, which is used to translate the logical block
number N into the physical block number I(N).

3. Read physical block I(N). 
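
The three steps above, and the way the indirection avoids an update avalanche, can be sketched with a toy model. Everything in it is an illustrative assumption rather than the ODF's actual data layout: the WORM disk is a dictionary from block number to contents, the aino is a list of block numbers, an i-node is a dictionary holding a block list, and `INODES_PER_EXTENT` is a hypothetical constant.

```python
INODES_PER_EXTENT = 16  # hypothetical; depends on the i-node size


def read_block(worm, aino, ino, n):
    """Read logical block n of the file with i-node number ino."""
    # Extent 0 of the admin file holds the superblock, so i-nodes start at 1.
    extent_index = 1 + ino // INODES_PER_EXTENT
    extent_block = aino[extent_index]            # step 1: consult the aino
    extent = worm[extent_block]                  # step 2: read the admin extent
    inode = extent[ino % INODES_PER_EXTENT]
    physical = inode["blocks"][n]                # translate N into I(N)
    return worm[physical]                        # step 3: read the data block


def rewrite_inode(worm, aino, ino, new_inode, write_point):
    """Write an updated copy of one admin file extent at the write point."""
    extent_index = 1 + ino // INODES_PER_EXTENT
    extent = list(worm[aino[extent_index]])
    extent[ino % INODES_PER_EXTENT] = new_inode
    worm[write_point] = extent       # the old extent block is left behind
    aino[extent_index] = write_point # only the aino entry changes
    return write_point + 1           # new current write point
```

Because directories refer to a file only by its i-node number, relocating the admin file extent changes one aino entry and nothing else.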

If the file system is in a consolidated state, i.e., all 
data on the WORM disk is current, the aino and the 
superblock are the last pieces of information writ- 
ten to the WORM device, directly before the current 
write point. Blocks written prior to the aino and 
the superblock contain mainly user data but also 
an occasional admin file extent, fully interleaved. 
Figure 3 shows the WORM layout. Since ODF 
requires the first admin file extent and the com- 
plete aino to be in the cache, introducing a disk 
with consolidated file systems to another system 
requires searching the current write point, reading 
the superblock, determining the aino length from 
the superblock, and finally reading the aino itself. 



Searching the current write point is a fairly fast 
operation implemented through binary search and 
hardware support, which allow the ODF to distinguish
between used and unused data blocks of 1K
bytes. 
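
Because a partition is written sequentially, every block before the write point is used and every block from the write point onward is unused, which is what makes a binary search possible. A minimal sketch follows; the `is_used` callback is a hypothetical stand-in for the hardware support mentioned above.

```python
def find_write_point(is_used, num_blocks):
    """Return the number of the first unused block in a sequentially
    written partition of num_blocks blocks."""
    lo, hi = 0, num_blocks
    while lo < hi:
        mid = (lo + hi) // 2
        if is_used(mid):
            lo = mid + 1     # the write point lies after mid
        else:
            hi = mid         # mid is unused; the write point is at or before it
    return lo
```

The search inspects on the order of log2(num_blocks) blocks, so even a partition with millions of blocks needs only a few dozen probes.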

ODF Caching Caching in the ODF is file oriented. 
We suggest a magnetic cache size of approximately 
5 percent of the optical disk space. If data from a 
file on a WORM disk is read, the ODF creates a cache 
file and copies a contiguous segment of file data 
from the WORM disk (64 kB in size, or less in the 
case of a small file) to the correct offset in the cache 
file. The cache file is the basis for all I/O operations 
until removed by the ODF, after having rewritten all 
dirty segments (i.e., updated or changed segments) 
back to the WORM device. The ODF provides special 
system calls (through the UNIX fcntl(2) interface) 
to asynchronously flush dirty file segments to the
WORM device and to remove a file's cache file. The
flusher daemon monitors high and low watermarks
for dirty cache contents. The daemon flushes dirty 
data to the optical disks. The flusher daemon 
flushes data in a sequence that minimizes the num- 
ber of WORM volume movements in a jukebox. The 
ODF deletes clean data (i.e., data already present on 
the optical disk) on a least-recently-used basis. 
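
Two details of this caching scheme can be sketched briefly. The sketch makes assumptions the text does not state: that cache segments are aligned on 64-kB boundaries within the file, and that the flusher daemon uses the two watermarks as a hysteresis (start flushing at the high watermark, stop at the low one). The function names are hypothetical.

```python
SEGMENT = 64 * 1024  # segment size in bytes, from the text


def segment_range(offset, file_size):
    """Byte range of the segment to copy from the WORM disk for a read at
    offset, assuming segments are aligned on 64-kB boundaries."""
    start = (offset // SEGMENT) * SEGMENT
    return start, min(start + SEGMENT, file_size)


def should_flush(dirty_bytes, high, low, flushing):
    """Decide whether the flusher should be writing dirty data right now.

    Assumed policy: begin once dirty data reaches the high watermark and
    continue until it has dropped to the low watermark.
    """
    if flushing:
        return dirty_bytes > low
    return dirty_bytes >= high
```

A small file yields a single, shorter segment, matching the "or less in the case of a small file" case described above.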

The admin file has its own cache file. The mini- 
mum amount of admin file data to be cached is the 
superblock. The ODF gradually caches the other 
admin file extents, which contain the i-nodes, while 
the file system is in use. The ODF writes i-node 
updates to the WORM device as soon as all i-nodes in 
the same admin file extent have their dirty file data 
written to the WORM device. The aino has its own 
cache file, also, and is always completely cached. 
If all file data and i-nodes have been written to the
WORM device, the file system can be consolidated
by a special utility that writes aino and superblock
to the WORM device, hence creating a consolidation
point.

Figure 3 WORM Layout for a Consolidated ODF File System

For reasons of modularity and ease of implemen- 
tation, we chose the UNIX standard magnetic disk 
file system implementation to perform the caching. 
An alternative would have been to use a magnetic 
disk cache with an optimized, ODF-specific struc- 
ture. We opted for a small amount of overhead, 
which would allow us to add a faster file system, 
should one become available. Our performance 
measurements showed a loss of less than 10 percent 
in performance as compared to that of an ODF- 
specific solution. The cache file systems on mag- 
netic disk can be accessed only through the ODF 
kernel component. Thus, in an active OFS system, 
no application can access and, therefore, possibly 
corrupt the cached data. 

Volume Manager 

In addition to hiding the WORM nature of the under- 
lying physical devices, the OFS transparently moves 
volumes between drives and storage slots in jukeboxes
that contain many volumes ("platters"). The
VM performs this function. 

The essential characteristic of the volume man- 
agement layer is its simple functionality, which 
is best described as a "volume faulting device." 
The interface to the VM consists of volume device 
entries, each of which gives access to a specific 
WORM volume in the system. For example, the vol- 
ume device entry /dev/WORM_A gives access to the 
WORM volume WORM_A. This volume device entry
has exactly the same interface as the usual device 
entry such as /dev/worm, which gives access to 
a specific WORM drive in the system, or rather 
to any volume that happens to be on that drive at 
that moment. Any access to a volume device, e.g., 
/dev/WORM_A, either passes directly to the drive on 
which the volume (WORM_A) is loaded, or results in 
a volume fault. This last situation occurs when the
volume is in a jukebox slot and not in a directly
accessible drive. Note that since /dev/WORM_A has
the same interface as /dev/worm, the OFS could
function without the VM layer in any system that
contains only one WORM drive and one volume that
is never removed from that drive. However, since 
this configuration is not a realistic option, the OFS 
includes the VM layer. 

The internal architecture of the VM is more com- 
plicated than its functionality might indicate. The 
VM consists of a relatively small kernel component 
and several server processes, as illustrated in Figure 
4. The kernel component is a pseudo-device driver 
layer that receives requests for the volume devices, 
e.g., /dev/WORM_A, and translates these requests 
into physical device driver (/dev/worm) requests 
using a table that contains the locations of loaded 
volumes. If the location of a volume can be found in 
the table, the I/O request is directly passed on to the 
physical device. Otherwise, a message is prepared
for the central VM server process (the volume
server), and the requesting application is put in a
waiting state.

The volume server uses a file to translate volume 
device numbers into volume names and locations. 
It communicates with two other types of VM server 
processes: jukebox servers and drive servers. The 
jukebox servers take care of all movements in 
their jukebox. Drive servers spin up and spin down 
their drive only on request from the volume server. 
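
The "volume faulting" behavior can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the VM code: the class and method names are hypothetical, a single free drive is assumed, and the jukebox and drive servers are collapsed into one `handle_fault` step.

```python
class VolumeFault(Exception):
    """Raised when a requested volume is not loaded in any drive."""


class VolumeManager:
    def __init__(self, slots):
        self.loaded = {}           # volume name -> physical drive device
        self.slots = dict(slots)   # volume name -> jukebox slot number

    def route(self, volume):
        """Kernel-side step: translate a volume device into a drive device."""
        try:
            return self.loaded[volume]   # fast path: pass the request through
        except KeyError:
            raise VolumeFault(volume) from None

    def handle_fault(self, volume, drive):
        """Server-side step: move the platter, then update the table."""
        if volume not in self.slots:
            raise LookupError(f"unknown volume {volume}")
        # Unload whatever volume currently occupies the drive.
        for name, device in list(self.loaded.items()):
            if device == drive:
                del self.loaded[name]
        self.loaded[volume] = drive

    def access(self, volume, free_drive="/dev/worm"):
        """Route a request, resolving a volume fault if one occurs."""
        try:
            return self.route(volume)
        except VolumeFault:
            self.handle_fault(volume, free_drive)
            return self.route(volume)
```

For example, `VolumeManager({"WORM_A": 1}).access("WORM_A")` first faults, loads the volume, and then returns the drive device, whereas a second access takes the direct path.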

Storage Manager 

The storage manager implements containers, as 
mentioned in the Electronic Document Manage- 
ment section. Large-scale document management 
uses indexing of multiple storage and retrieval 
attributes, typically with the help of a relational 
database. Once the contents of a document are 
identified through a database query on its attri- 
butes, a single pointer to the contents is sufficient. 

Figure 4 Global Architecture Showing the VM Component

Also, there is little need for a hierarchically struc- 
tured file system. Containers provide large, flat 
structures where the contents of a document are 
uniquely defined by the container identification 
and a unique identification within the container. 
The document's contents identification is translated 
by the storage manager in a path to a directory where 
one or more contents files can be written. For multi- 
page image documents, the Megadoc system stores 
each page as a separate image file in a directory 
reserved for the document. This scheme guarantees 
locality of reference, avoiding unnatural delays 
while browsing a multipage image document. 

A container consists of a sequence of file sys- 
tems, typically spanning multiple volumes. Due to 
the nature of the OFS, no distinction has to be made 
between WORM disk file systems and magnetic disk 
file systems. The storage manager fills containers 
sequentially, up to a configurable threshold for 
each file system, allowing some degree of local 
updates (e.g., adding an image page to an existing 
document). As soon as a container becomes full, a
new file system can be added. 
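
A minimal Python sketch of this placement scheme follows. It is an illustration only: the class names, the path layout, the page file naming, and the default threshold of 0.9 are assumptions, not details of the Megadoc storage manager.

```python
from dataclasses import dataclass, field


@dataclass
class FileSystem:
    mount_point: str
    capacity: int      # total bytes
    used: int = 0


@dataclass
class Container:
    name: str
    threshold: float = 0.9   # hypothetical fill threshold per file system
    file_systems: list = field(default_factory=list)
    locations: dict = field(default_factory=dict)   # document id -> directory

    def add_file_system(self, fs):
        self.file_systems.append(fs)

    def store_document(self, doc_id, page_sizes):
        """Place a new document in the current (last) file system."""
        size = sum(page_sizes)
        fs = self.file_systems[-1] if self.file_systems else None
        if fs is None or fs.used + size > fs.capacity * self.threshold:
            raise RuntimeError("container full: add a new file system")
        fs.used += size
        path = f"{fs.mount_point}/{self.name}/{doc_id}"
        self.locations[doc_id] = path
        return path

    def page_paths(self, doc_id, n_pages):
        """One image file per page, all in the document's own directory."""
        base = self.locations[doc_id]
        return [f"{base}/page{i:04d}.img" for i in range(1, n_pages + 1)]
```

Space above the threshold stays free in each file system, which is what leaves room for local updates such as adding a page to an existing document.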

Containers in a system are network-level 
resources. A name server holds container locations. 
Relocation of the volume set of a container to 
another jukebox, e.g., for load balancing, is possible 
through system management utility programs and 
can be achieved without changing any application's 
indexing database. 

RetrievAll — The Megadoc Image 
Application Framework 

Early Megadoc configurations required extensive
system integration work. RetrievAll is the second-
generation image application framework (IAF). The
first generation was based on delivery of the source code
of example applications. However, tracking source
changes proved too burdensome and hampered
the introduction of new base functionality.

In cooperation with European sales organi- 
zations, we formulated a list of requirements for a 
second-generation IAF. The framework must 

1. Allow for standard applications. Standard appli- 
cations, i.e., scan, index, store, and retrieve, cover 
a wide range of customer requirements in folder 
management. Tailoring standard applications 
can be accomplished in one day, without pro- 
gramming effort. 

2. Be usable in system integration projects. The 
IAF must provide APIs for folder management, 
allowing the field to build applications with 
functionality beyond the standard applications 
by reusing parts of the standard applications. 

3. Allow image enabling of existing applications.
RetrievAll should allow the linkage of electronic
image documents and folders with entities, such
as order number or invoice number, in existing
applications. Existing applications need not
be changed; they run on the image workstation
under a terminal emulator.

4. Accommodate internationalization. All text pre- 
sented by the application to the end user should 
be in the native language of the user. RetrievAll 
should support more than one language simulta- 
neously for multilingual countries. 

5. Allow upgrading. A new functional release of 
RetrievAll should have no effect on the customer- 
specific part of the application. 

6. Provide document routing. After scanning the
documents, RetrievAll should route references
to new image documents to the in-trays of users
who need to take action on the new documents.

Image Documents in 
Their Production Cycle 

Image documents start as hard-copy pages that 
arrive in a mailroom, where the pages are prepared 
for scanning. Paper clips and staples are removed, 
and the pages are sorted, for example, per depart- 
ment. An image batch contains the sorted stacks of 
pages. The scanning application identifies batches 
by a set of attributes. The scanning process offers 
a wide variety of options, including scanning one 
page or multiple pages, accepting or rejecting the 
scanned image for image quality control, batch 
importing from a scanning subsystem, browsing 
through scanned pages, and controlling scanner 
settings. 

The indexing process regroups image pages of an 
image batch into multipage image documents. Each 
document is identified with a set of configurable 
attributes and optionally stored in one or more 
folders. Folders also carry a configurable set of 
attributes. On the basis of the attribute values, the 
document contents are stored in the document's 
storage location (container). 

Many users of RetrievAll applications use only its
retrieve functions, which locate stored folders and
documents. Folders and
documents can be retrieved by specifying some of 
the attributes. RetrievAll allows the configuration 
of query forms that represent different views on the 
indexing database. The result of a query is a list of 
documents or folders. For documents, the opera- 
tions are view, edit, delete, print, show folder, and 
put in folder. The Megadoc editor is used to view 
and to manipulate the pages of the document 
including adding new pages by scanning or import- 
ing. For folders, the operations are list documents, 
delete, and change attributes. 

Document Routing Applications 
A RetrievAll routing application is an extension to a 
folder management application. A route defines 
how a reference to a folder travels along in-trays of 
users or work groups. 

Systems Management 

The following systems management functions sup- 
port the RetrievAll package: 

■ Container management

■ Security, i.e., user and group permissions 

■ Logging and auditing 

■ Installation, customization, tailoring, and local- 
ization 

Architecture and Overview 
As illustrated in Figure 5, the RetrievAll image appli- 
cation framework consists of a number of modules. 
Each module is a separate program that performs a 
specific function, e.g., scanning or document index- 
ing. Each module has an API to control its function- 
ality, and some modules have an end-user interface. 
Modules can act as building blocks under a control
module. For example, an image document capture 
application uses 

1. Scan handling, to let an end user scan pages into 
a batch. 

2. Scanner settings, to allow the user to set and 
select the settings for a scanner. The user can 
save specific settings for later reference. 

3. Batch handling, to allow the end user to create, 
change, and delete batches. 

These three modules can operate together under 
the control of the scan control module and in this 
way form a document capture application. The 
scan control module can, under control of a main 
module, perform the document capture function 
in a folder management application. 

Modules communicate by means of dynamic data 
exchange (DDE) interfaces provided in the 
Microsoft Windows environment. Each module, 
except the main module, can act as a server, and all 
modules can act as clients in a DDE communication. 

Main Module Any RetrievAll application has a 
main module that controls the activation of major 
functions of the application. These functions 
include scanning pages into batches, identifying 
pages from batches into multipage image docu- 
ments and assigning documents to folders, and 
retrieving documents and folders. The main mod- 
ule presents a menu to select a major function. The 
main module activates the control modules of the 
major functions in an asynchronous way. For exam- 
ple, the main module can activate a second major 
function, e.g., retrieve, when the first major func- 
tion, e.g., identification, is still active. 

Figure 5 RetrievAll Module Overview



Control Modules Each major RetrievAll function 
has a control module that can run as a separate 
application. For example, when a PC acts as a scan 
workstation, it is not necessary to offer all the func- 
tionality by means of the main module. Control 
modules can be activated as a server through the 
DDE API with the main module as client or as a pro- 
gram item from a Microsoft Windows program 
group. 

Server Modules All modules, with the exception 
of the main module, act as DDE server modules. 

Configuration files hold environment data for 
each module. An application configuration file 
describes which modules are in the configuration. 
The layout of the configuration files is the same as 
the WIN.INI file used by the Microsoft Windows
software, allowing the reuse of standard access 
functions. 
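
To show what reuse of a standard INI layout buys, the sketch below reads a configuration of this kind with Python's standard `configparser` module. The section names, keys, and values are invented for illustration; the actual RetrievAll file contents are not given in this paper.

```python
import configparser

# Hypothetical application configuration file in WIN.INI style.
SAMPLE = """\
[Application]
Modules=Main ScanControl BatchHandling ScannerSettings

[ScanControl]
Language=nl
Server=1
"""


def load_modules(text):
    """Return {module name: settings} for the modules listed in the file."""
    parser = configparser.ConfigParser()
    parser.optionxform = str          # keep key capitalization as written
    parser.read_string(text)
    names = parser["Application"]["Modules"].split()
    return {
        name: dict(parser[name]) if parser.has_section(name) else {}
        for name in names
    }
```

Calling `load_modules(SAMPLE)` yields one entry per configured module, with an empty settings dictionary for any module that has no section of its own.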

Making an Application 

An application can be made by selecting certain 
modules. Figure 5 gives an overview of the modules 
used for the standard folder management applica- 
tion. The installation program, which is part of the 
standard applications, copies the appropriate mod- 
ules to the target system and creates the configura- 
tion files. 

Modules can also be used with applications other 
than the standard ones. Image enabling an existing 
(i.e., legacy) application (see Figure 6), such as an 
order entry application where the scanned images of 
the orders should be included, entails the following: 



■ The existing application is controlled by a termi- 
nal emulator program running in the Microsoft 
Windows environment. This terminal emulator 
program must have programming facilities with 
DDE functions. 

■ While entering a new order into the system, the 
image document representing the order is on 
the screen. The function to include the image 
can be mapped on a function key of the emula- 
tor. Pressing the function key results in a DDE 
request to the identification function of the 
RetrievAll components. This DDE request passes 
the identification of the document (as known in 
the order entry application) to the identification 
function. 

Summary 

This paper has provided an overview of the many 
components and disciplines needed to build an 
effective image document management system. We 
discussed the details of the WORM file system, the 
storage manager technology, and the image applica- 
tion framework. Other aspects such as WORM 
peripheral technology, software compression and 
decompression of images, and the integration of 
facsimile and optical character recognition tech- 
nologies were not covered. 

From experience, we know that different cus- 
tomers have different requirements for image docu- 
ment management systems. The same experience, 
however, taught us to discover certain patterns 
in customer applications; we captured these patterns
in the application framework. The resulting
framework allows us to build highly customized
applications with low system integration cost and
short time to deployment. Future directions are in
the area of enhanced folder management and integrated
distributed work flows.

Figure 6 Image Enabling a Legacy Application

The Design of Multimedia
Object Support in DEC Rdb

Storing multimedia objects in a relational database offers advantages over file
system storage. Digital's relational database software product DEC Rdb supports the
storing and indexing of multimedia objects: text, still frame images, compound
documents, audio, video, and any binary large object. After evaluating the existing
DEC Rdb version 3.1 for its ability to insert, fetch, and process multimedia data,
software designers decided to modify many parts of Rdb and to use write-once optical
disks configured in standalone drive or jukebox configurations. Enhancements
were made to the buffer manager and page allocation algorithms, thus reducing
wasted disk space. Performance and capacity field tests indicate that DEC Rdb can
sustain a 200-kilobyte-per-second SQL fetch throughput and a 57.7-kilobyte-per-second
SQL/Services fetch throughput, insert and fetch a 2-gigabyte object, and build
a 50-gigabyte database.



To accommodate the increasing demand for com- 
puter storage and indexing of multimedia objects, 
Digital supports multimedia objects in its DEC Rdb 
relational database software product. This paper 
discusses the improvements over version 3.1 and 
presents details of the new features and algorithms 
that were developed for version 4.1 and are used in 
version 5.1. This advanced technology makes the 
DEC Rdb commercial database product a precursor
of sophisticated database management systems. 

Multimedia objects, such as large amounts of 
text, still frame images, compound documents, and 
digitized audio and video, are becoming standard 
data types in computer applications. Devices that 
scan paper, e.g., facsimile machines, are inexpensive
and ubiquitous. Devices that capture and play back 
full-motion video and audio are just beginning to 
reach the mass market. Capturing these objects for 
use within a computer results in many large data 
files. For example, one minute of digitized and com- 
pressed standard TV-quality video requires approxi- 
mately 50 megabytes (MB) of storage! 

To date, relational databases have been used 
successfully in storing, indexing, and retrieving 
coded numbers and characters. Relational algebra 
is an effective tool for reorganizing queries to 
reduce the number of records, e.g., from 1 million 
to 70 records, that an application program must 
search to obtain the desired information. Other
database features, such as transaction processing,
locking, recovery, and concurrent and consistent 
access, are essential to the successful operation of 
numerous businesses. Electronic banking, credit 
card, airline reservation, and hospital information 
systems all rely on these features to query, main- 
tain, and sustain business records. 

However, although a business might have its 
numbers and characters organized, controlled, and 
managed in a computer database, maintaining the 
paper and film storage media associated with 
database records can be costly, both in dollars and 
in human resources. Some estimates place the 
worldwide data storage business at $40 billion, and 
as much as 95 percent of the information is stored 
on either paper or film. Currently, businesses such 
as insurance, banking, engineering, and medicine 
depend on human beings to manage the filing and 
retrieval of these extensive paper and film archives. 
Human error can result in the loss of paper and 
film. Clearly, scanning the paper, storing the infor- 
mation in a computer, and making this information 
available over computer networks is a better way 
to manage paper records. This scheme allows 
(1) multiple copies to be distributed at once; (2) a 
customer file to be electronically located and 
retrieved in seconds, whereas to materialize a 
paper folder can take days; and (3) properly 
programmed computers to maintain these types 
of information more efficiently and accurately than
humans can. 

The idea of eliminating paper-based storage of 
business records in favor of computer storage is 
long-standing. However, only recently have techni- 
cal developments made it practical to consider cap- 
turing, storing, and indexing large quantities of 
multimedia objects. Storage robots based on mag- 
netic tape or optical disk can be configured in 
the range of multiple terabytes (TB) at the low cost 
of 45 cents per MB. Central processors based on 
reduced instruction sets are getting fast enough to 
process multimedia objects without having to rely 
on digital signal coprocessors. Processor main 
memory can be configured in gigabytes (GB). 
Document management systems, which have 
thrived over the past few years, deliver computer 
scanning, indexing, storage, and retrieval across 
local area networks. 

Until now, most multimedia objects have been 
stored in files. Document management systems 
generally use commercial relational database tech- 
nology to store the documents' index and attribute 
information, where one attribute is the physical 
location of the file. This approach has several disad- 
vantages: considerable custom software must be 
written and maintained to make the system appear 
logically as one database; application programs 
must be written against these proprietary software 
interfaces; a system based on both files and a rela- 
tional database is difficult to manage; two backup- 
and-restore procedures must be learned and 
applied; and complications in the recovery process 
can occur, if the database and file system backups 
are executed independently.

By contrast, storing
multimedia objects in a relational database offers 
several advantages over file system storage. 

■ Coding an application against one standard
interface, structured query language (SQL), to
store object attribute data as well as multimedia
objects is easier than coding against both SQL to
manage attribute data and a file system to store
the multimedia object.

■ The database requires only one tool to back up 
and monitor data storage rather than two to 
maintain the database and the file system. 

■ The database guarantees that concurrent users 
see a consistent view of stored information. In 
contrast to a file system, a database provides a
locking mechanism to prevent writers and readers
from interfering with one another in a general
transaction scheme. However, a file system
does offer locks to prevent readers and writers 
from simultaneous file access. 

■ The database guarantees, assuming that proper 
backup and maintenance procedures are fol- 
lowed, that no information is lost as a result of 
media or machine failure. All transactions com- 
mitted by the database are guaranteed. A file sys- 
tem can be restored only up to the last backup, 
and any files created between the last backup 
and the system failure are lost. 

In the sections that follow, we present (1) the 
results of an evaluation of DEC Rdb version 3.1 for
its ability to insert, fetch, and process multimedia 
objects; (2) a discussion of the impact of optical 
storage technology on multimedia object storage; 
and (3) design considerations for optical disk support,
transaction recovery, journaling, the physical
database, language, and large object data storage 
and transfer. The paper concludes with the results 
of DEC Rdb performance tests. 

Evaluation of DEC Rdb as a 
Multimedia Object Storage System 

Given the premise that production systems need to 
store multimedia objects, as well as numbers and 
characters, in databases, the SQL Multimedia engi- 
neering team members evaluated the following DEC 
Rdb features to determine if the product could sup- 
port the storage and retrieval of multimedia 
objects: 

■ Large object read and write performance 

■ Maximum large object size 

■ Maximum physical capacity available for storing 
large multimedia objects 

The DEC Rdb product has always supported a 
large object data type called segmented strings, 
also known as binary large objects (BLOBs). The evo- 
lution from support for BLOBs to a multimedia 
database capability was logical and straightfor- 
ward. In fact, the DEC Rdb version 1.0 developers 
envisioned the use of the segmented string data 
type for storing text and images in the database. 

In evaluating DEC Rdb version 3.1, we came to a 
variety of conclusions about the existing support 
for storing and retrieving multimedia objects. 
Descriptions of the major findings follow. 

The DEC Rdb SQL, which is compliant with the
standards of the American National Standards 
Institute (ANSI) and the International Organization 
for Standardization (ISO), and SQL/Services, which 
is client-server software that enables desktop com- 
puters to access DEC Rdb databases across the net- 
work, did not support the segmented string data 
type. Note that the most recent SQL92 standard 
does not support any standard large object mecha- 
nisms. 1 Object-oriented relational database exten- 
sions are expected to be part of the emerging SQL3 
standard. 2 

The total physical capacity for storing large 
objects and for mapping tabular data to physical 
storage devices is insufficient. All segmented string 
objects have to be stored in only one storage area in 
the database. This specification severely restricts 
the maximum size of a multimedia database and 
thus impacts performance. One cannot store a large 
number of X-rays or one-hour videos on a 2- to 3-GB 
disk or storage area. Contention for the disk would 
come from any attempt to access multimedia 
objects, regardless of the table in which they are 
stored. Although multiple discrete disks can be 
bound into one OpenVMS volume set, thereby 
increasing the maximum capacity, data integrity 
would be uncertain. Losing any disk of the volume 
would result in the loss of the entire volume set. 

The maximum size of the database that DEC Rdb
can support is 65,535 storage areas, where each area
can span 2^32 - 1 pages. That translates to 256 terapages
(i.e., 256 × 10^12 pages) or 128 petabytes (PB)
(i.e., 128 × 10^15 bytes). At a penny per megabyte, a
128-petabyte storage system would cost 1.28 billion
dollars!
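
The capacity figures above can be checked with a few lines of Python. The 512-byte page size is an assumption inferred from the ratio of the two quoted totals (128 petabytes over 256 terapages); it is not stated in this section. Note also that the product of the two limits comes to about 2.81 × 10^14 pages, so the "256 tera" and "128 peta" figures match when the prefixes are read as powers of two (2^40 and 2^50).

```python
MAX_AREAS = 65_535                 # storage areas per database
MAX_PAGES_PER_AREA = 2**32 - 1     # pages per storage area
PAGE_BYTES = 512                   # assumption: inferred, not stated in the text

total_pages = MAX_AREAS * MAX_PAGES_PER_AREA
total_bytes = total_pages * PAGE_BYTES

tera_pages = total_pages / 2**40   # binary "tera"
peta_bytes = total_bytes / 2**50   # binary "peta"

# Cost of 128 PB (decimal) at one cent per megabyte.
cost_dollars = 128e15 / 1e6 * 0.01

print(f"{tera_pages:.1f} terapages, {peta_bytes:.1f} PB, ${cost_dollars:,.0f}")
```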

The largest BLOB that DEC Rdb can maintain is 275
TB (i.e., 275 × 10^12 bytes). At a data storage rate of
1 megabyte per second (MB/s) for motion video and
audio, this translates into 8.7 years of video. However, as
mentioned previously, the maximum size and the 
total number of objects that can be stored are lim- 
ited. As part of system testing, we successfully 
stored and retrieved a 2-GB object in a DEC Rdb data 
field. 

DEC Rdb uses a database key to reference individ- 
ual segments stored in database pages. A BLOB 
belongs to only one column of one row of a rela- 
tion. The database key value that locates the first 
segment is stored in the column of a table defined 
to represent the BLOB data type. DEC Rdb imple- 
ments segmented strings as singly linked lists of 
segments. Therefore, version 3.1 must read a segment
in order to find the next segment. This process
has two disadvantages: (1) random positioning
within a BLOB data stream is extremely slow, and (2)
BLOB pages cannot be prefetched asynchronously.
Figure 1 illustrates a DEC Rdb version 3.1 singly
linked list segmented string implementation.
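
The cost of random positioning in a singly linked segment list can be seen in a short Python model. This is an illustrative sketch, not DEC Rdb code: the `Segment` class, the dictionary standing in for database pages, and the integer keys standing in for database keys are all hypothetical.

```python
class Segment:
    def __init__(self, data, next_key=None):
        self.data = data           # bytes stored in this segment
        self.next_key = next_key   # key of the next segment, or None at the end


def read_at(pages, first_key, offset):
    """Return (byte value at offset, number of segment reads needed)."""
    key, reads = first_key, 0
    while key is not None:
        segment = pages[key]       # models one page read per segment
        reads += 1
        if offset < len(segment.data):
            return segment.data[offset], reads
        offset -= len(segment.data)
        key = segment.next_key     # only known after reading this segment
    raise IndexError("offset beyond end of BLOB")
```

Because each key is discovered only by reading the previous segment, reaching an offset in the n-th segment costs n reads, and no read can be issued ahead of time. These are the two disadvantages listed above.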

BLOB data transfer performance of DEC Rdb version
3.1 was promising. We were able to code a load
test that sustained 65 kilobytes per second (kB/s); a 
fetch test sustained 125 kB/s. To put these measure- 
ments in perspective, DEC Rdb is capable of insert- 
ing more than one A4-size (210 millimeters [mm] 
by 297 mm, i.e., approximately 8.25 by 11.75 inches) 
scanned piece of paper per second and capable of 
fetching more than two A4-size pieces of paper per 
second. The test was conducted by writing and 
reading 50-kB memory data buffers to and from 
magnetic storage areas defined by the DEC Rdb soft- 
ware. This experiment ignores the overhead of net- 
work delays and compression. 

DEC Rdb version 3.1 can write multiple copies
of BLOBs, one to the target database storage area
and one to each of the database journal files. The
journal files provide for transaction recovery and
recovery from system failures, such as disk drive failures. Database
journal files tend to be bottlenecks, because every 
data transaction is recorded in the journal. 
Therefore, writing large objects to journal files dra- 
matically impacts both the size of the journal file 
and the I/O to the journal file. 

The volume of storage required for most modest 
multimedia applications can be measured in tera- 
bytes. A magnetic disk storage system 1 TB in size 
is expensive to purchase and maintain. An alterna- 
tive storage device that provided the capacity at a 
much lower cost was required. We investigated the 
possibility of using Digital's RV20 write-once opti- 
cal disk drive and the RV64 optical library ("juke- 
box") system based on the RV20 drives. We quickly 
rejected this solution because the optical disk 
drives were interfaced to the Q-bus and UNIBUS 
hardware as tape devices. Since relational databases 
use tape devices for backup purposes only and not 
for direct storage of user data, these devices were 
not suitable. Note that physically realizing and 
maintaining a large data store is a problem for both 
file systems and relational databases. 

DEC Rdb version 3.1 does not support large
capacity write once, read many (WORM) devices,
which are suitable for storing large multimedia
objects. Version 3.1 has no optical jukebox support
either.

Storage Technology Impact 
When we evaluated DEC Rdb version 3.1, a 1-TB mag- 
netic disk farm was orders of magnitude more 
expensive than optical storage. Large format 12- or 
14-inch (i.e., 30.5- or 35.6-centimeter) WORM opti- 
cal disks have a capacity of 6 to 10 GB. The WORM 
drives support removable media. These drives can 
be configured in a jukebox, where a robot transfers 
platters between storage slots and drives. A fully 
loaded optical jukebox, which includes optical disk 
drives and a full set of optical disk platters, of 
approximately 1-TB capacity costs about $400,000, 
i.e., $0.40 per MB. By comparison, Digital's RA81 
magnetic disk drive, for example, has a capacity 
of 500 MB and costs $20,000. Thus, to store 1 TB of 
data would require 2,000 RA81 disk drives at a total 
cost of $40 million, i.e., $40.00 per MB! 
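
The cost comparison reduces to simple arithmetic, reproduced here in Python using only the prices and capacities quoted above and decimal units (1 TB = 10^6 MB). The variable names are illustrative.

```python
TB_IN_MB = 1_000_000          # decimal units: 1 TB = 10**6 MB

# Optical jukebox, about 1 TB fully loaded.
jukebox_cost = 400_000
jukebox_per_mb = jukebox_cost / TB_IN_MB

# Magnetic disk farm built from RA81 drives.
ra81_capacity_mb = 500
ra81_price = 20_000
drives_needed = TB_IN_MB // ra81_capacity_mb
magnetic_cost = drives_needed * ra81_price
magnetic_per_mb = magnetic_cost / TB_IN_MB

ratio = magnetic_per_mb / jukebox_per_mb

print(f"optical ${jukebox_per_mb:.2f}/MB, magnetic ${magnetic_per_mb:.2f}/MB")
```

With these figures the magnetic configuration costs about one hundred times as much per megabyte, i.e., two orders of magnitude.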

How big is one terabyte? Assume, conservatively, 
that a standard business letter scanned and com- 
pressed results in an object that is 50 kB in size. 
Therefore, 1 TB can store 20 million business let- 
ters, i.e., 40,000 reams of paper at 500 sheets per 
ream. A ream is approximately 2 inches (51 mm)
high, so 1 TB is equivalent to a stack of paper 80,000
inches or 6,667 feet or 1.25 miles (2 kilometers) 
high! The total volume of paper is 160 cubic yards 
(122 cubic meters). A 1-TB optical disk jukebox is 
about 3 to 4 cubic yards (2.3 to 3 cubic meters). 
Assuming TV-quality video, 1 TB can store 308 
hours or approximately 12 days of video. Full- 
motion video archives suitable for use in the broad- 
cast industry require petabytes of mass storage. 
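
The paper-stack comparison can be recomputed as follows. The sheet dimensions of 8.5 by 11 inches (US letter) are an assumption; the text does not state which paper size underlies the volume figure. All other inputs are taken from the paragraph above.

```python
TB = 10**12                        # bytes, decimal
LETTER_BYTES = 50 * 10**3          # 50 kB per scanned, compressed letter

letters = TB // LETTER_BYTES
reams = letters // 500             # 500 sheets per ream
stack_inches = reams * 2           # each ream is about 2 inches high
stack_feet = stack_inches / 12
stack_miles = stack_feet / 5280

# Assumption: US letter sheets, 8.5 x 11 inches; 36**3 cubic inches per cubic yard.
volume_cubic_yards = reams * 8.5 * 11 * 2 / 36**3

# Average data rate implied by 308 hours of video per terabyte.
video_bytes_per_second = TB / (308 * 3600)

print(letters, reams, round(stack_feet), round(volume_cubic_yards))
```

The implied video rate of roughly 0.9 MB/s is in line with the estimate of about 50 MB per minute of compressed TV-quality video given in the introduction.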

The gap between affordable and practical config- 
urations of optical disk jukeboxes and magnetic 
disk farms has closed considerably since late 1992. 
Juxtaposing equal amounts (700 GB) of magnetic 
and optical storage, including storage device inter- 
connects, installation, and interface software, 
reveals that magnetic disk storage is about five 
times more expensive than optical storage. The 
major disadvantage of optical jukebox storage is 
data retrieval latency related to platter exchanges. 
This latency, which is approximately 15 seconds, 
varies with the jukebox load and how data is 
mapped to different platters. 

Mass storage technology, including device inter- 
connects, combines different classes of storage 
devices into storage hierarchies. Storage manage- 
ment software continues to be a challenging aspect 
of large multimedia databases. 

To provide 1 TB of mass storage capacity for relational
database multimedia objects at reasonable
cost, we conducted a review of third-party optical 
disk subsystems, hardware, and device drivers for 
VAX computers running the OpenVMS operating 
system. A characterization of the available optical 
disk subsystems revealed three basic technical alter- 
natives. 

1. Low-level device drivers provided by the drive 
and jukebox manufacturers. 

2. Hardware and software that model the entire 
capacity of an optical disk jukebox as one large 
virtual address space. 

3. Write-once optical disk drives interfaced as stan- 
dard updatable magnetic disks. The overwrite 
capability is provided at either the driver or the 
file-system level, where overwritten blocks are 
revectored to new blocks on the disk. For exam- 
ple, consider a file of 100 blocks created as a sin- 
gle extent on a WORM device. When requested to 
rewrite blocks 50 and 51, the WORM file system 
writes the new blocks onto the end of all blocks
written. The system also writes a new file header
that contains three file extents: blocks 0 to 49 
stored in the original extent; blocks 50 to 51 
stored in the new extent; and blocks 52 to 99
stored as the third extent. Obviously, files that 
are updated frequently are not candidates for 
WORM storage. However, immutable objects, 
such as digitized X-rays, bank checks, and health- 
benefit authorization forms, are ideal candidates 
for WORM storage devices. 
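
The revectoring in the third alternative can be sketched in a few lines of Python. This is only an illustration of the bookkeeping described above, not the code of any actual optical disk file system: the function name, the tuple layout of an extent, and the convention that blocks are numbered from 0 (so a 100-block file spans logical blocks 0 to 99) are all assumptions.

```python
def rewrite_blocks(extents, end_of_written, first, last):
    """Revector logical blocks first..last (inclusive) of a WORM file.

    extents        -- list of (logical_start, physical_start, length) tuples
    end_of_written -- next never-written physical block on the volume
    Assumes first..last lies inside the file. Returns the extent list for the
    new file header and the updated end-of-written position.
    """
    new_extents = []
    for lstart, pstart, length in extents:
        lend = lstart + length - 1
        if lend < first or lstart > last:
            new_extents.append((lstart, pstart, length))  # untouched extent
            continue
        if lstart < first:   # leading part keeps its original blocks
            new_extents.append((lstart, pstart, first - lstart))
        if lend > last:      # trailing part keeps its original blocks
            skip = last + 1 - lstart
            new_extents.append((last + 1, pstart + skip, lend - last))
    count = last - first + 1
    # The replacement blocks are appended after everything written so far.
    new_extents.append((first, end_of_written, count))
    return sorted(new_extents), end_of_written + count


# A 100-block file written as one extent; blocks 50 and 51 are rewritten.
header, end = rewrite_blocks([(0, 0, 100)], 100, 50, 51)
print(header)
```

Each rewrite consumes fresh blocks and adds extents to the header, which is why frequently updated files fragment quickly and exhaust a write-once volume.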

As a result of this investigation, we decided that 
using write-once optical devices, interfaced as stan- 
dard disk devices, was the best solution to provide 
optical storage for multimedia object storage. This 
functionality is being met with commercially avail- 
able optical disk file and device drivers. 

In the future, WORM devices may be superseded 
by erasable optical or magnetic disks. However, 
experts expect that WORM devices, like microfilm, 
will continue to be useful for legal purposes. 

Design Considerations 

The tamperproof nature of WORM devices is an 
asset but causes special problems in database sys- 
tem design. The evaluation of DEC Rdb version 3.1 
indicated that several features needed to be added 
to the DEC Rdb product to make it a viable multime- 
dia repository. This section describes the design of 
the new multimedia features included in DEC Rdb 
versions 4.1 through 5.1. 

Mass Storage 

DEC Rdb version 4.1 supports WORM optical disks 
configured in standalone drive or jukebox configu- 
rations. DEC Rdb permits database columns that 
contain multimedia objects to be stored or mapped 
to either erasable (magnetic or optical disk) or 
write-once (optical disk) areas. The write-once 
characteristic can be set and reset to permit the 
migration of the data to erasable devices. No 
changes to application programs are required to 
use write-once optical disks, including jukeboxes. 

The main design goals for WORM area support 
were to 

■ Reduce wasted optical disk space by taking into 
account the write-once nature of WORM devices 

■ Not introduce DEC Rdb application program- 
ming changes for WORM areas 

■ Maintain the atomicity, consistency, isolation, 
and durability (ACID) properties of transactions 
for WORM devices 



■ Maintain comparable performance, allowing for 
hardware differences between optical and mag- 
netic devices 

DEC Rdb uses the optical disk file system to cre- 
ate, extend, delete, and close database storage files 
on WORM devices. Although this approach uses the 
block revectoring logic in the optical disk file sys- 
tem, minimal space is wasted. When writing blocks 
to WORM devices, DEC Rdb explicitly knows that 
blocks can be written only once and bypasses the 
revectoring logic in the optical disk file system. 

Nonetheless, DEC Rdb software could waste 
space in two major ways. First, when DEC Rdb cre- 
ates a storage area on an erasable medium (e.g., 
a magnetic or erasable optical disk), the database 
pages are initialized to contain a standard page for- 
mat, with page numbers, area IDs, checksums, etc. 
Preinitialized database pages help to determine cor- 
rupted database pages. However, preinitializing 
database pages on write-once media makes little 
sense. The second way in which DEC Rdb could 
waste write-once optical disk pages is to use stor- 
age allocation bit maps for space management 
(SPAM). SPAM pages are used to keep track of free 
and used pages. As records are added to and deleted 
from the database, the SPAM bit maps are constantly 
updated. SPAM pages are maintained within each 
database file. With write-once devices, a page can 
be used only once. Again, it makes no sense to 
update SPAM pages for write-once media. 

To eliminate needlessly wasting space on write- 
once media, DEC Rdb does not preinitialize WORM 
pages. As a general rule, WORM areas should not 
contain any updatable data structures. DEC Rdb 
maintains WORM storage space allocation in the 
database root file. The database root file should 
always reside on a magnetic disk, because the root 
file is frequently updated and magnetic disks yield 
higher performance. The clusterwide object manager
mechanism ensures that the pointer to the end
of the written area is consistent across a cluster. 

SPAM pages, although disabled for write-once 
areas, are in fact allocated anyway. The reason 
for allocating SPAM pages in a write-once area is to 
provide the ability to migrate the contents of the 
storage area to an erasable device. The SPAM pages 
simply need to be rebuilt to reflect the space uti- 
lization at the point of conversion. 

This write-once characteristic was the basis for 
several enhancements to the buffer manager and 
page allocation algorithms. Given that a free WORM 
page has never been written to, the buffer manager 
simply materializes an initialized buffer in main
memory for write operations without having to 
first read the page from disk. In the case of page 
allocation for magnetic disks, DEC Rdb must scan 
SPAM pages in search of enough free storage space 
to satisfy a write operation. The scanning algorithm 
is much simpler for write-once areas; to store new 
records, DEC Rdb allocates one more page at the 
end of the written portion of the area to a process. 
DEC Rdb maintains such allocated pages in a queue 
called the marked WORM page queue on a per-process
basis. Whenever a WORM page is written
to disk, that page is taken off the marked WORM 
page queue. An attempt to store a record checks 
the queue before allocating new WORM pages to 
storage. Facilities exist to allocate many WORM 
pages in one operation, thus minimizing the num- 
ber of writes to the root file. 

By explicitly taking into account the write-once 
characteristic of the device, DEC Rdb greatly 
reduces wasted space, keeping optical disk read 
and write performance high. 
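
The allocation scheme just described can be approximated as follows. This is a minimal sketch under stated assumptions, not DEC Rdb's implementation: the class names, the `batch` parameter, pages numbered from 1, and the simplification that every store fills exactly one page are all hypothetical.

```python
from collections import deque


class WormArea:
    """Write-once area whose only space bookkeeping is an end-of-area pointer."""

    def __init__(self, pages_written=0):
        self.end_of_area = pages_written  # assumed to live in the root file
        self.root_file_writes = 0

    def allocate(self, count=1):
        """Hand out the next `count` pages with a single root-file update."""
        first = self.end_of_area + 1
        self.end_of_area += count
        self.root_file_writes += 1
        return list(range(first, first + count))


class Process:
    """A database process with its own marked WORM page queue."""

    def __init__(self, area, batch=1):
        self.area = area
        self.batch = batch
        self.marked = deque()  # pages allocated to this process, not yet written

    def store(self, record):
        # Check the queue first; allocate only when it is empty.
        if not self.marked:
            self.marked.extend(self.area.allocate(self.batch))
        # Writing the page to disk removes it from the marked queue.
        return self.marked.popleft()


area = WormArea(pages_written=106)
proc = Process(area, batch=4)
pages = [proc.store(b"segment") for _ in range(5)]
print(pages, area.root_file_writes)
```

Allocating in batches trades a few pages that may be left unused for fewer updates of the root file, and no SPAM page is ever scanned.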

Transaction Recovery 

To understand the discussion of transaction recov- 
ery, the concepts of first- and second-class records 
must be understood. Both alphanumeric records 
and BLOB segments are stored in database pages. 
Alphanumeric records are first-class records and 
thus have identities in tables; these records are the 
rows. First-class records are required to be on a 
medium that permits update (either magnetic disk 
or erasable optical disk). All relation tuples are
first-class records. Second-class records, such as BLOBs,
have no identities of their own. BLOBs can exist only 
within the domain of an alphanumeric record and 
are pointed to by first-class records. Second-class 
records may be located in WORM areas. 

Multimedia objects can be stored as second-class 
records in either write-once or erasable areas. 
However, due to transaction recovery constraints, 
the rows of relations must be stored in magnetic 
disks as first-class records. 

If an update transaction against the database is 
aborted, then the database must restore the state of 
all database areas to pretransaction state. Regard- 
less of the transaction recovery scheme employed, 
e.g., hybrid undo-redo, the effects of an uncom- 
mitted transaction to write-once media may have to 
be undone. 

By definition, a write transaction on write-once 
media, once complete, can never be undone. In
cases where a transaction fails and the transaction
has written data to a write-once area, DEC Rdb 
employs a logical undo operation. This operation 
de-references the database key that points to the 
BLOB data written as part of the failed transaction. 
An example helps to illustrate how the logical undo 
operation works. 

1. Consider row R of table T, which contains a col- 
umn defined as data type BLOB. 

2. The BLOB storage map indicates that the large 
objects are stored in a write-once area. 

3. A process starts a transaction and updates the 
row storing a BLOB in the write-once area. 

4. For some reason the transaction aborts. 

5. Recovery nullifies the value of the database key 
that locates the first page of the BLOB. 

The write-once pages can never be reused and 
will never again be allocated. Nothing points to or 
references data written as part of an aborted 
transaction. 
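
The five steps above can be sketched as follows. The names `Row`, `store_blob`, and `logical_undo` are hypothetical, a Python dictionary stands in for the write-once area, and the row is assumed to hold no earlier BLOB; the point is only that undo touches the updatable row and never the write-once pages.

```python
class Row:
    """First-class record on updatable media; holds the BLOB's database key."""

    def __init__(self):
        self.blob_dbkey = None


def store_blob(row, worm_pages, page_number, data):
    """Burn the BLOB data into a write-once page and point the row at it."""
    worm_pages[page_number] = data   # cannot be erased afterwards
    row.blob_dbkey = page_number


def logical_undo(row):
    """Recovery for an aborted transaction: nullify the database key only."""
    row.blob_dbkey = None


worm_pages = {}
row = Row()
store_blob(row, worm_pages, 107, b"x-ray bits")
logical_undo(row)                    # the transaction aborted
print(row.blob_dbkey, 107 in worm_pages)
```

After the undo the data is still physically present on the platter, but no database key leads to it.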

This transaction recovery scheme introduces the 
interesting phenomenon of WORM holes. Consider 
the following scenario: 

■ A write-once area has the first 106 pages written 
and allocated. 

■ Process X starts a transaction that writes a BLOB 
segment to the write-once area. 

■ Page 107 is allocated for process X. 

■ Later in time, process Y starts a transaction to 
store a BLOB in the same write-once area. 

■ Process Y causes pages 108 to 120 to be allo- 
cated, data is written, the transaction commits, 
and process Y disconnects from the database. 

■ At this point, process X decides to roll back its 
transaction. 

■ Page 107 remains in its unwritten, zero-filled state.

Page 107 can never be allocated to store BLOB data. 
Recall that DEC Rdb manages space on write-once 
devices by maintaining an end-of-area pointer to 
keep track of pages that have been written. Zero- 
filled pages that will never be allocated are called 
WORM holes. WORM holes are interesting because 
DEC Rdb utilities, such as verify, expect to find all 
allocated pages in a standard format. The utilities 
have been modified to ignore empty pages on 
write-once areas. 
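
The scenario can be replayed with a small simulation. It is a sketch only: the function name and the use of a Python set to record written pages are assumptions, and pages are numbered from 1 as in the scenario.

```python
def find_worm_holes(end_of_area, written_pages):
    """Pages at or below the end-of-area pointer that were never written."""
    return [p for p in range(1, end_of_area + 1) if p not in written_pages]


written = set(range(1, 107))     # pages 1..106 are written and allocated
end_of_area = 106

end_of_area += 1                 # process X is allocated page 107
end_of_area += 13                # process Y is allocated pages 108..120
written.update(range(108, 121))  # Y writes its pages and commits

# Process X rolls back without ever writing page 107.
holes = find_worm_holes(end_of_area, written)
print(holes)
```

Because the end-of-area pointer only moves forward, a utility that walks the area must be prepared to skip such pages rather than report them as corrupt.
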
Journaling Design Considerations
An effective database management system guar- 
antees the recovery of a database to a consistent 
state in the event of a major system failure, such 
as media failure. Hence, full and incremental back- 
ups must be performed at regular intervals, and 
the database must record or keep a journal file of 
transactions that occur between backups. In DEC 
Rdb, the after-image journal (AIJ) file records all
transactions against the database since the last 
backup. Also, to recover from a system failure, the 
database must keep track of all outstanding or 
pending transactions. The recovery unit journal 
(RUJ) file records the state and data associated with
all pending transactions. 

Journal files are heavily utilized in a database 
management system. Contention for the journal 
files comes from every process that is updating 
the database. To be completely recoverable, the 
database management system must record BLOB 
data, as well as alphanumeric data, to both the AIJ
and the RUJ files. Because multimedia objects are
large, eliminating the need to write these objects to
the journal files is desirable. The double-write trans- 
action negatively impacts the performance of the 
application storing the object and taxes the journal 
file, one of the most burdened resources in the 
database. 

As discussed in the Transaction Recovery sec- 
tion, DEC Rdb uses logical undo operations to undo 
aborted transactions. In addition to the minimal 
processing required to de-reference a database key
pointing to the WORM area pages, DEC Rdb automat- 
ically disables RUJ log writes for WORM area records. 
This is another advantage of using WORM devices 
for multimedia objects. 

Recording multimedia objects in the AIJ file is 
not so straightforward. DEC Rdb uses the AIJ file
for media recovery, as well as for transaction
recovery. By definition, keeping a media recovery
journal forces twice the number of I/O operations,
each to a separate device. DEC Rdb must write 
the multimedia object to the storage area desig- 
nated for the multimedia object and write a copy of 
the object to the AIJ file. If the primary storage 
device that contains the object fails, the database 
administrator can apply the last full backup of 
the storage area, followed by any subsequent incre- 
mental backups, and roll forward through the 
AIJ journal file to recover the data. If a multi- 
media database is to be completely recoverable 
and consistent, then multimedia objects must be
recorded in the AIJ file. Since they can never be
erased, WORM optical disks might be the best 
devices to write an object (or a journal file) to. Even
though a jukebox can misfeed and permanently
damage the media, disks in a jukebox can be disk 
shadowed. The trade-off is doubling the I/O versus 
risking data integrity. Rather than legislate a policy, 
DEC Rdb permits applications to disable AIJ logging 
for BLOBs, thus transferring the risk to individual 
applications. 

Database Physical Design Considerations 
The original design of segmented strings specified 
a singly linked list, where the segments were 
written one at a time, as shown in Figure 1. When 
writing a new segment, the previous segment 
had to be updated with a pointer value that identi- 
fied the location of the new segment. For example, 
to store a BLOB with two segments R1 and R2,
the old algorithm stored R1, stored R2, and then
modified R1 to point to R2. Although this algorithm
does not waste space on a magnetic disk, it does
waste space on write-once optical disk. Segment
R1 must be rewritten to disk with a pointer to
segment R2. 

If we impose the dependency between the two 
stores that R2 must be stored before R1, the store
dependency for BLOBs becomes a reverse order 
of segments. Storing segments in reverse order 
requires buffering all segments of a multimedia 
object. Whereas buffering the entire object in main 
memory may be feasible for small multimedia 
objects, main memory is not large enough to buffer 
audio and video data objects. The singly linked 
list method that DEC Rdb used prior to version 4.1
is not well suited for WORM devices. Therefore, we 
redesigned the format of BLOBs in WORM areas to 
eliminate the need to buffer large amounts of data. 

The new design replaces the singly linked list 
with BLOB segment pointer arrays and BLOB data 
segments. The segment pointer array maintains 
a list of database keys that locate each segment, in 
order, for a BLOB, as illustrated in Figure 2. Because 
segment pointer arrays are stored as a singly linked 
list, the pointer arrays can become large. 
Application data is stored in BLOB data segments.
The new method buffers and writes the BLOB seg- 
ment pointers to disk after assigning the segmented 
string to a record. 
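
A minimal sketch of the pointer array idea follows. It is not the on-disk format used by DEC Rdb: `WormStore`, `store_blob`, and `read_segment` are hypothetical names, a page index stands in for a database key, and a single pointer array is assumed (the chaining of several arrays into a linked list is left out).

```python
class WormStore:
    """Append-only page store standing in for a write-once area."""

    def __init__(self):
        self.pages = []

    def write(self, payload):
        """Write a page exactly once and return its key (here, its index)."""
        self.pages.append(payload)
        return len(self.pages) - 1


def store_blob(store, segments):
    """Stream data segments to the store, then write the pointer array last.

    Only the keys are buffered in memory, never the segment data, and no
    page is ever rewritten.
    """
    keys = [store.write(segment) for segment in segments]
    return store.write(tuple(keys))


def read_segment(store, array_key, index):
    """Address any segment directly through the pointer array."""
    return store.pages[store.pages[array_key][index]]


store = WormStore()
blob_key = store_blob(store, iter([b"seg0", b"seg1", b"seg2"]))
print(len(store.pages), read_segment(store, blob_key, 2))
```

Three segments cost four page writes here, whereas the singly linked list would burn an extra page each time a stored segment had to be rewritten with its forward pointer.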

Besides eliminating the waste problem for write-once
devices, the segment pointer array has other
advantages. DEC Rdb reads the pointer array into
memory when an application accesses a BLOB. DEC
Rdb can, therefore, quickly and randomly address
any segment in the BLOB. Also, DEC Rdb can begin
to load segments into main memory before the
application requests them. This feature benefits
applications that sequentially access an object,
such as playing a video game.

Figure 2 Rdb Version 4.2 Pointer Array Segmented String Implementation

Storage Map Enhancements for BLOBs 
Designers addressed several issues related to stor- 
age mapping. The major problems solved involved 
capacity and system management, jukebox perfor- 
mance, and the failover of full volumes. 

Capacity and System Management DEC Rdb can 
map user data, represented logically as tables, rows, 
and columns, into multiple files or storage areas. 
Besides increasing the amount of data that can 
be stored in the database, spreading data across 
multiple devices reduces contention for disks and 
improves performance. However, as mentioned in 
the section Evaluation of DEC Rdb as a Multimedia 
Data Storage System, prior to DEC Rdb version 4.1, 
only one storage area could be used for storing 
BLOB data. All BLOB columns in the database were 
implicitly mapped into the single area, which 
severely limited the maximum amount of multi- 
media data that could be stored in DEC Rdb. 

Prior to new multimedia support for BLOBs, DEC 
Rdb restricted the direct storage of a particular 
table column to one DEC Rdb storage area (i.e., file). 
This partitioning control is accomplished by means 
of the DEC Rdb storage map mechanism, as shown 
in the following code example: 



Create storage map BLOB_MAP
    Store Lists
        in RESUME_AREA
            for (PLACEMENT_HISTORY,
                 CANDIDATES.RESUME)
        in PHOTO_AREA
            for (CANDIDATES.PICTURE)
        in RDB$SYSTEM;

This code directs the BLOB data from the table 
PLACEMENT_HISTORY and the column RESUME of
the table CANDIDATES to be stored in the area 
RESUME_AREA and the BLOB column PICTURE of 
the table CANDIDATES to be stored in the area 
PHOTO_AREA. The remaining BLOB data in the 
database is stored in the default RDB$SYSTEM area. 

Restricting the storage of all BLOBs across the 
entire database schema to a single file or database 
area was clearly undesirable. The size of the area 
would be limited to the largest file that could be 
created by the OpenVMS operating system and the 
mass storage devices available. The limited map- 
ping of one BLOB area mapped to one disk 
can be circumvented by using the OpenVMS sys- 
tem's Bound Volume Set mechanism. This mecha- 
nism allows n discrete disks to be bound into one 
logical disk. DEC Rdb can then create a single stor- 
age area on the logical disk that spans the bound 
set of disks. 

However, although the volume set mechanism 
solves the problem of limited area mapping, serious 
limitations exist in the database system administra- 
tion and recovery processes. All database-related 
facilities operate at the granularity of a database 
storage area. Thus, if one disk in a 10-disk volume 
set is defective, DEC Rdb would have to restore all 
10 disks. Not only does restoring data on functioning
disks waste processing time, but during the
restore operation, applications are stalled for access 
at the area level. This situation introduces concur- 
rency problems for on-line system operations. 

DEC Rdb version 4.1 and successive versions 
solve the capacity problem by (1) permitting the 
definition of multiple BLOB storage areas, (2) bind- 
ing discrete storage areas into storage area sets, and 
(3) providing the ability to map or to vertically 
partition individual BLOB columns to areas or area 
sets. Applications can set aside a disk or a set of 
disks for storing employee photographs, X-rays, 
video, etc. The alphanumeric data and indexes 
can be stored in separate areas as well. Figure 3 
depicts the employee photograph column being 
mapped to the EMP_PHOTO_1, EMP_PHOTO_2, and
EMP_PHOTO_3 storage area set. All alphanumeric 
data in the table EMPLOYEES is assumed to be 
mapped to storage area A. 

Coding this example results in 

Create storage map BLOB_MAP
    Store Lists
        in (EMP_PHOTO_1, EMP_PHOTO_2,
            EMP_PHOTO_3)
            for (EMPLOYEES.PHOTOGRAPH)
        in RDB$SYSTEM;

This code directs the BLOB data, i.e., the column 
PHOTOGRAPH from the table EMPLOYEES, to be 
stored in the three specified areas EMP_PHOTO_1,
EMP_PHOTO_2, and EMP_PHOTO_3.

The ability to define multiple BLOB storage areas 
and to bind discrete areas into a storage set elimi- 
nates the BLOB storage capacity limitation in DEC 
Rdb. Consider the problem of storing medical X-rays
of 1 MB each as part of a patient record. Prior to
DEC Rdb version 4.1, the limited one-BLOB storage 
area could store approximately 2,000 X-rays on a 
2-GB disk device. The features included in version 
4.1 allow the creation of a DEC Rdb storage area set 
that spans multiple disk devices. Also, adding stor- 
age areas or disks to a storage area set can expand 
the capacity initially defined for the column. 
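
The capacity figures follow from simple division. The helper below is a back-of-the-envelope sketch: it assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), uniform 1-MB objects, and no database page or pointer overhead; the ten-disk case is an illustrative value, not one taken from the text.

```python
MB = 10**6
GB = 10**9


def xray_capacity(disk_bytes, xray_bytes, n_disks=1):
    """Whole X-rays that fit when each disk holds one storage area."""
    return n_disks * (disk_bytes // xray_bytes)


print(xray_capacity(2 * GB, 1 * MB))              # single BLOB storage area
print(xray_capacity(2 * GB, 1 * MB, n_disks=10))  # hypothetical ten-area set
```

With a storage area set, capacity grows linearly with the number of areas added, rather than being capped by the largest single file.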

Jukebox Performance Problems When a storage 
area set is defined using the SQL storage map state- 
ment, DEC Rdb implements a random algorithm 
to select a discrete area or disk from the set to store 
the next object. Since multiple processes access 
multimedia objects across the entire set, a random 
algorithm that evenly distributes data across the 
disks in the area set reduces contention for any 
one disk. 

Using a random algorithm to select from a set 
of platters in a jukebox is extremely inefficient.
A jukebox comprises one to five disk drives with 50
to 150 shelf slots where optical disk media is stored.
A storage robot exchanges optical disk platters 
between drives and storage slots. As described ear- 
lier, a full platter exchange — spin down the platter 
currently in the drive, eject the platter, insert a new 
platter, spin up the new platter — takes approxi- 
mately 15 seconds. Each optical disk surface, i.e., 
side of a platter, is modeled as a discrete disk to the 
OpenVMS operating system. Consider, for example, 
ten storage areas defined on optical disks in the 
jukebox and mapped into a storage area set. All 
patient X-rays from a single table in the database are
to be stored in this area set. Each new X-ray inserted 
in the database causes DEC Rdb to randomly select a 
disk surface in the jukebox, which probably results 
in a platter exchange. Consequently, each X-ray 
insertion takes 15 seconds! 

The solution to the jukebox performance prob- 
lem was not to eliminate random storage area selec- 
tion, which works successfully with fixed-spindle 
devices. Rather, the solution was to accommodate 
an alternate algorithm that sequentially filled the 
disks in an area set. Using DEC Rdb, applications can 
specify random or sequential loading of storage 
area sets as part of the storage map statement. 
Contention for a single optical disk in a jukebox is
a far more desirable situation, with respect to 
latency, than causing one platter exchange per 
object stored. 
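
The difference between the two loading policies can be illustrated with a toy model. It assumes a jukebox with a single drive, one storage area per disk surface, a fixed number of objects per area, and a 15-second exchange whenever the selected area is not the one mounted; the function and parameter names are hypothetical, and the total number of objects must not exceed the capacity of the area set.

```python
import random

EXCHANGE_SECONDS = 15  # approximate platter exchange time quoted in the text


def platter_exchanges(n_objects, n_areas, per_area, policy, seed=0):
    """Count platter exchanges needed to insert n_objects into an area set."""
    rng = random.Random(seed)
    used = [0] * n_areas
    mounted = None
    exchanges = 0
    for _ in range(n_objects):
        open_areas = [a for a in range(n_areas) if used[a] < per_area]
        if policy == "sequential":
            area = open_areas[0]            # fill one area before the next
        else:
            area = rng.choice(open_areas)   # spread objects over the whole set
        if area != mounted:
            exchanges += 1
            mounted = area
        used[area] += 1
    return exchanges


seq = platter_exchanges(100, 10, 50, "sequential")
rnd = platter_exchanges(100, 10, 50, "random")
print(seq * EXCHANGE_SECONDS, rnd * EXCHANGE_SECONDS)
```

In this model sequential loading needs one exchange per area actually filled, while random loading can never need fewer and usually needs close to one per object.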

When multiple users simultaneously issue 
requests to read multimedia objects stored in a 
jukebox, long delays occur, whether the storage 
area is loaded sequentially or randomly. Using a 
transaction monitor to serialize access to the 
database helps eliminate jukebox thrashing and 
improve the aggregate performance of the database 
engine. 

Failover of Full Volumes The introduction of
storage area sets gave rise to another problem: 
What happens when one area in the set becomes 
full? Normally, within the DEC Rdb environment, 
disk errors that result from trying to exceed the 
allocated disk space are signaled to the application 
so that the transaction can be rolled back (dis- 
carded). When related to storage area sets, how- 
ever, the error is just an indication that a portion of 
the disk space allocated to the column has been 
exhausted and that processing should continue. 
Also, since multimedia objects tend to be exceed- 
ingly large, great amounts of data may have already 
exhausted cache memory and been written back to 
the WORM media, even though the database trans- 
action has not committed. Handling such an error 
by signaling to the application and expecting the 
application to roll back and retry the transaction 
would result in the waste of a large number of 
device blocks that have already been burned. Thus, 
DEC Rdb had to implement a new scheme. 

DEC Rdb now implements full failover of an area
within the area set. Thus, when an area becomes
full, DEC Rdb traps the error, selects a new area in
the set, and writes the remaining portion of the 
BLOB being written to the new area. This area 
failover works whether the storage allocation is 
random or sequential. In addition, the area that 
is now full is marked with the attribute of full, and 
the clusterwide object manager of DEC Rdb main- 
tains this attribute consistently throughout the 
cluster. Consequently, writers to the database will 
consider the area unavailable for future BLOB store 
operations. Further, the DEC Rdb database manage- 
ment utilities can remove the attribute if additional 
space is made available to the database area (e.g., if 
DEC Rdb moves BLOBs from area A to another copy 
of area A that resides on a device with twice the 
capacity). 
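
Area failover can be sketched as a retry loop around the segment write. The classes and the `AreaFull` exception are hypothetical, capacity is counted in segments for simplicity, and areas are tried in list order; the clusterwide propagation of the full attribute is not modeled.

```python
class AreaFull(Exception):
    """Raised when a write would exceed the space allocated to an area."""


class Area:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity   # in segments, for simplicity
        self.segments = []
        self.full = False          # the "full" attribute described above

    def write(self, segment):
        if len(self.segments) >= self.capacity:
            raise AreaFull(self.name)
        self.segments.append(segment)


def store_blob(area_set, segments):
    """Write a BLOB, continuing in another area when the current one fills."""
    candidates = [a for a in area_set if not a.full]
    if not candidates:
        raise AreaFull("every area in the set is full")
    current = candidates.pop(0)
    keys = []
    for segment in segments:
        while True:
            try:
                current.write(segment)
                break
            except AreaFull:
                current.full = True       # later writers skip this area
                if not candidates:
                    raise                 # the whole set is exhausted
                current = candidates.pop(0)
        keys.append((current.name, len(current.segments) - 1))
    return keys


a, b = Area("A", 2), Area("B", 5)
print(store_blob([a, b], [b"s1", b"s2", b"s3"]))
```

The segments already burned into area A stay valid; only the remainder of the BLOB moves to area B, so no written blocks are wasted on a rollback.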



Language Design Considerations 
SQL, the ISO/ANSI standard relational database 
structured query language, is well suited to 
expressing queries against alphanumeric data 
yet hardly begins to address the needs of multi- 
media objects. Putting aside the fact that sampled 
data (i.e., a scanned image) is more difficult to 
query than coded data (e.g., text coded in ASCII), 
SQL cannot provide data compression and ren- 
dition capabilities for multimedia objects. 
Multimedia object processing is better suited to 
a language like C or C++. Ideally, SQL would support
the ability to define objects and to associate
methods with those objects. SQL3 is a new version
of the SQL standard that the standards organizations
are just beginning to work on. SQL3 contains the
mechanism to define abstract data types and to exe- 
cute external procedures as part of SQL statements. 
However, SQL3 will not become a standard for four 
to five years. 

As discussed previously, DEC Rdb SQL lacks 
support for the segmented string or BLOB data 
type that was available in the Rdb relational engine. 
A new DEC Rdb SQL data type, LIST OF BYTE 
VARYING, was designed based on the native Rdb 
segmented string data type. The data access mecha- 
nism for the LIST OF BYTE VARYING data type is 
a list cursor, which operates like a table cursor:
open the cursor, fetch segments of a BLOB, and
close the cursor. This new data type with asso- 
ciated access mechanism was also added to 
SQL/Services. SQL/Services software enables remote 
clients on a network, such as personal com- 
puters, to attach to remote DEC Rdb databases. 
The ability to scroll or to randomly position the 
list cursor allows positioning at a particular data 
segment within the multimedia object stream with- 
out having to physically read through the entire 
data stream. 
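
The open, fetch, and close cycle, together with random positioning, can be modeled as shown below. This Python class only mimics the behavior described; it is not the DEC Rdb SQL or SQL/Services interface, and the method names are assumptions.

```python
class ListCursor:
    """Toy model of a cursor over the segments of one BLOB."""

    def __init__(self, segments):
        self._segments = segments
        self._pos = None          # None means the cursor is closed

    def open(self):
        self._pos = 0

    def position(self, index):
        """Jump to a segment without reading the ones before it."""
        if self._pos is None:
            raise RuntimeError("cursor is not open")
        if not 0 <= index <= len(self._segments):
            raise IndexError("segment index out of range")
        self._pos = index

    def fetch(self):
        """Return the next segment, or None at the end of the list."""
        if self._pos is None:
            raise RuntimeError("cursor is not open")
        if self._pos >= len(self._segments):
            return None
        segment = self._segments[self._pos]
        self._pos += 1
        return segment

    def close(self):
        self._pos = None


cursor = ListCursor([b"header", b"frame1", b"frame2"])
cursor.open()
cursor.position(2)
print(cursor.fetch())
cursor.close()
```

An application still has to loop over segments and reassemble the object itself, which is the inconvenience the following paragraphs address.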

Although applications can program directly to 
list cursors, this interface was cumbersome and did 
not offer any object typing or processing. The list 
cursor mechanism does not present the straightfor- 
ward byte-stream interface that is common in most 
file systems. Applications want to store objects, 
such as images and compound documents, not 
BLOBs. Data compression was another important 
consideration. Multimedia objects should be com- 
pressed on the client side of the network; then, 
compressed bits are transferred through the net- 
work, servers, and disks. The objects should be 
decompressed when they are to be rendered for 
display. Finally, the enormous size of multimedia
objects saturates main memory resources on per- 
sonal computers, so application developers must 
use disk storage to buffer as well as persistently 
store multimedia objects. 

The limitations of the LIST OF BYTE VARYING data 
type and the list cursor data access mechanism led 
to the development of multimedia object exten- 
sions. SQL Multimedia is an object library that oper- 
ates against SQL and SQL/Services. SQL Multimedia 
allows application developers to classify or type 
multimedia data types (e.g., IMAGE, TEXT, and 
COMPOUND_DOCUMENT) and to specify the data 
format within a type or class. Because no widely 
agreed upon multimedia object encodings or for- 
mats exist, we decided not to limit the types of data 
encoding or formats that could be stored in the 
database. For example, the database can store an 
image in Digital Document Interchange Format 
(DDIF) or Tagged Image File Format (TIFF). The 
option of defining a canonical encoding and format 
for each object class was too restrictive. 

In both the SQL and the SQL/Services versions, 
the SQL Multimedia insert and fetch calls operate 
within the bounds of a transaction. All multimedia 
objects enjoy the same rights and privileges as 
alphanumeric data types in the database, with 
respect to concurrent access, recovery, etc. 

A process that attaches to a DEC Rdb database 
can specify that an authorization identifier or a
default identifier be created and referenced by the
"RDB$HANDLE" symbolic label. A transaction can
be started explicitly or a default transaction begins. 
To operate within the bounds of the default trans- 
action, the SQL Multimedia routines required 
access to the default authorization identifier 
RDB$HANDLE. A new SQL compile time switch, for 
the SQL module language and precompilers, causes 
this identifier to be defined in a global address 
space. The SQL Multimedia routines can thus access 
the value of the identifier. If a distributed transaction
identifier is not passed to the SQL Multimedia
routines, the SQL Multimedia operation is executed 
using the default transaction. 

SQL Multimedia improves the cumbersome list 
cursor interface by supporting the following object 
sources and destinations: 

■ The entire object sourced from or deposited to 
main memory 

■ The object buffered through main memory 

■ A file 



SQL Multimedia handles file I/O operations 
across many different software environments, 
including the MS-DOS, Windows, Macintosh, 
ULTRIX, and OpenVMS operating systems. SQL
Multimedia preserves file attributes on insert oper- 
ations. For example, the Macintosh file system's 
resource fork, which contains the name and ver- 
sion of the application to be launched when the 
object is accessed by a user, is preserved. If another 
Macintosh user fetches the object to a local file, 
then SQL Multimedia restores the file including 
the resource fork. Assuming the second user has 
the same application, the user can now access 
and manipulate the multimedia object, e.g., a com- 
pound document or a QuickTime video file. Rules 
and default file organizations exist for the case 
where a user inserted a file from an OpenVMS 
system and another user causes the object to be 
fetched to a different client file system, say on a 
PC. Application programmers can direct SQL 
Multimedia to override the default file attributes. 

Although SQL Multimedia handles disparate file 
system I/O, at present, it does not convert multime- 
dia object formats or encodings. Images captured 
and stored in DEC Rdb in DDIF are delivered to each 
client in DDIF. 

SQL Multimedia makes it easy for application 
programmers to insert and fetch compound docu- 
ments to and from the database. The buffered 
I/O data stream conforms to Digital's Compound
Document Architecture (CDA) stream management 
interface. Fetching a compound document using 
the buffered I/O interface, SQL Multimedia returns 
the address of a procedure entry mask, a data buffer 
pointer, and the buffer length. These returned argu- 
ments can be passed to the CDA viewer in the 
DECwindows environment. The viewer then repeat- 
edly calls the SQL Multimedia buffer-fill procedure 
until the object has been transferred to the viewer 
and displayed. 

In addition, SQL Multimedia provides object- 
specific processing for image and text objects. Disk 
image objects formatted according to DDIF and 
main memory objects formatted according to 
Digital's image toolkit DECimage Application
Services (DAS) can be processed on either fetch 
or insert operations. SQL Multimedia leverages 
the capabilities of DAS software to provide image 
processing, e.g., compression, decompression, 
scaling, and dithering. When an image is inserted 
or fetched, SQL Multimedia object processing 
arguments permit the specification of image 
process steps and parameters. The DAS toolkit
supports Comité Consultatif International
Télégraphique et Téléphonique (CCITT) compression
(a ubiquitous compression standard for facsimile
machines) for bitonal images and Joint
Photographic Experts Group (JPEG) compression
(an ISO/ANSI standard) for multispectral images.

To improve application performance, SQL 
Multimedia can generate multiple rendered ver- 
sions of an image that are stored in a single database 
field. Therefore, a user can store the original image, 
retaining its fidelity, and also store a miniature 
version of the image for fast access or browsing pur- 
poses. For example, consider a personnel applica- 
tion where 90 percent of the fetches for employee 
photographs are to be displayed in a passport-size
format on an employee information form. If
the capture portion of the application stored the 
original employee photograph and directed SQL 
Multimedia to generate and store a passport-size 
rendered version in addition to the original, at fetch 
time, the I/O operations required to transmit the 
image to the employee form would be reduced. 
Storing multiple rendered versions would also elim- 
inate using CPU time to scale the fetched image. 
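
The I/O saving in the personnel example can be made concrete with a small sketch. The `ImageField` class and the image sizes (50,000 and 5,000 bytes) are assumptions chosen for illustration; only the 90 percent figure comes from the text.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ImageField:
    """One database field holding several renditions of the same image."""
    renditions: Dict[str, bytes] = field(default_factory=dict)

    def store(self, name: str, data: bytes) -> None:
        self.renditions[name] = data

    def fetch(self, name: str) -> bytes:
        return self.renditions[name]


def average_bytes_per_fetch(photo: ImageField, small_percent: int) -> float:
    """Mean bytes moved per fetch when small_percent of fetches use the miniature."""
    small = len(photo.fetch("passport"))
    full = len(photo.fetch("original"))
    return (small_percent * small + (100 - small_percent) * full) / 100


photo = ImageField()
photo.store("original", bytes(50_000))  # assumed size of the full image
photo.store("passport", bytes(5_000))   # assumed size of the miniature

print(average_bytes_per_fetch(photo, 90))  # 9500.0 bytes, versus 50000 without it
```

Under these assumed sizes, the average fetch moves less than a fifth of the data, at the cost of storing the miniature alongside the original.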

System Testing and Evaluation 

After the multimedia engineering of the DEC Rdb 
product was complete, we conducted several test- 
ing activities to determine the performance and 
capacity boundaries. The performance work pre- 
sented is not complete but is offered as an indica- 
tion of the multimedia object access capabilities of 
the DEC Rdb software. 

In the debit credit domain, the Transaction 
Processing Performance Council (TPC) tests provide
a standard procedure to measure the performance
of one database as compared to another.
However, no standard multimedia database per- 
formance tests exist. The performance of a DEC 
Rdb multimedia database is influenced by many 
variables, including the processor, mass storage 
medium, database design, object sizes, and work- 
load. The performance data presented in this paper 
should be used only as a guide. 

Performance Testing 

For performance testing we used a VAX 6360 pro- 
cessor (relatively slow by today's standards) config- 
ured with 128 MB of main memory, an HSC50 
storage interconnect processor with 16 RA70
magnetic disks, 6 RA92 magnetic disks, and 2 ESE20
solid-state disks. The total mass storage available 
for building databases was 10 GB. We evaluated 
the SQL performance of DEC Rdb version 4.2 Field 
Test 1 (FT1) and SQL Multimedia version 1.0 Field 
Test 2 (FT2), and generated the SQL/Services remote 
client data fetch and insert performance data for 
DEC Rdb version 4.1 Field Test 4 and SQL Multimedia 
version 1.0 FT2. 

This performance data should be used as a guide- 
line, because the field-test software contained 
implementation errors that affected performance 
but were corrected in the released products. As pre- 
sented in Table 1, using the released version of DEC 
Rdb, we are able to sustain a 300-kB/s throughput 
from a magnetic disk DEC Rdb storage area, across 
an Ethernet network, to a DECstation 5240 work- 
station. This test demonstrates fetching a software 
motion pictures (SMP) video clip out of the database
for display on an ULTRIX-based workstation.
Although the video was sampled at 15 frames per 
second, we can play back the video clip at 20 
frames per second! The performance measured for 
an SQL/Services fetch was 57.7 kB/s, as shown in 
Table 2. We expect to conduct similar performance 
tests on a DEC 7000 AXP processor. 

The performance test inserted and fetched 50-kB 
records. Fifty kilobytes is a conservative estimate of 
a compressed A4-size piece of paper, probably the 
most prevalent object to be stored in multimedia 
databases. For both the distributed SQL/Services 
client and the local SQL interface, 50-kB main mem- 
ory buffers were the sources and destinations for 
the inserts and fetches. 

We built several 50-MB databases, varying data- 
base design parameters such as page and buffer 
sizes, to determine the fastest set of parameters
for the large object performance test. Using the 
largest page and buffer sizes yielded the best perfor- 
mance. The database table was organized into three 
columns: two key columns and a BLOB column. The 
BLOB column was mapped to a storage area set con- 
sisting of multiple magnetic storage disks. 
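
A test of this shape can be sketched with Python's built-in `sqlite3` module. SQLite is only a stand-in here, not DEC Rdb, and the table name, column names, and helper functions are hypothetical; the sketch keeps the three-column layout (two keys and a BLOB), the 50-kB record size, and the idea of varying the number of inserts per transaction.

```python
import sqlite3
import time

RECORD_SIZE = 50 * 1024  # 50-kB objects, as in the test described above


def make_database() -> sqlite3.Connection:
    """Create an in-memory table with two key columns and a BLOB column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects ("
        " key_a INTEGER NOT NULL,"
        " key_b INTEGER NOT NULL,"
        " body BLOB,"
        " PRIMARY KEY (key_a, key_b))"
    )
    return conn


def insert_records(conn, count: int, per_transaction: int) -> float:
    """Insert `count` records, committing every `per_transaction`; return kB/s."""
    payload = bytes(RECORD_SIZE)
    start = time.perf_counter()
    for first in range(0, count, per_transaction):
        with conn:  # commits the batch as one transaction
            for key in range(first, min(first + per_transaction, count)):
                conn.execute("INSERT INTO objects VALUES (?, ?, ?)", (key, 0, payload))
    elapsed = max(time.perf_counter() - start, 1e-9)
    return count * RECORD_SIZE / 1024 / elapsed


def fetch_records(conn):
    """Fetch every record; return (bytes fetched, kB/s)."""
    start = time.perf_counter()
    total = sum(len(body) for (body,) in conn.execute("SELECT body FROM objects"))
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / 1024 / elapsed


conn = make_database()
print("insert kB/s:", round(insert_records(conn, 100, 10)))
print("fetch  kB/s:", round(fetch_records(conn)[1]))
```

The absolute numbers from an in-memory SQLite database say nothing about DEC Rdb; the sketch only shows how throughput in kB/s is derived from record size, record count, and elapsed time.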

After we established the best database organiza- 
tion, we built many 3- to 10-GB databases by 

■ Varying the number of processes executing
insert and fetch operations

■ Varying the number of tables in the database

■ Varying the number of inserts and fetches per
transaction

■ Enabling and disabling AIJ journaling

■ Inserting and fetching from an SQL/Services
client or using SQL for local database access

Table 1 SQL Performance

SQL Insert Performance

Number of Processes
Performing Insert     Number of   Number of Inserts          Throughput
Operations            Tables      per Transaction     AIJ    (kB/s)

 1                     1             1                No      83.0
 1                     1            10                No     103.4
 1                     1             1                Yes     48.0
 1                     1            10                Yes     55.9
 3                     3            32                No     295.3
 6                     6            32                No     533.7
10                    10            32                No     601.5

SQL Fetch Performance

Number of Processes
Performing Fetch      Number of   Number of Fetches          Throughput
Operations            Tables      per Transaction     AIJ    (kB/s)

 1                     1            10                No     194.0
 1                     1             1                No     184.0
 1                     1             1                Yes    181.0
 1                     1            10                Yes    192.5

Table 2 SQL/Services Performance

SQL/Services Insert Performance

Number of Processes
Performing Insert     Number of   Number of Inserts          Throughput
Operations            Tables      per Transaction     AIJ    (kB/s)

 1                     1          1024                No      44.0
 4                     4            32                No      91.9

SQL/Services Fetch Performance

Number of Processes
Performing Fetch      Number of   Number of Fetches          Throughput
Operations            Tables      per Transaction     AIJ    (kB/s)

 1                     1          1024                No      57.7
 4                     4            32                No     142.3

When we conducted the performance tests, the 
computer was dedicated to our task; no other activ- 
ity was taking place. A simple contention test, 
where multiple readers simultaneously fetch
objects from a single table, and a more complicated
update test, where multiple writers are simultane- 
ously updating one table, have yet to be fabricated 
and run. 

To put some of the performance results presented
in Table 1 into perspective: the tested configuration
can sustain approximately 600 kB/s of insert
bandwidth, which translates into twelve 50-kB
A4-size pieces of paper per second. Even a single
process scanning paper at 103.4 kB/s can keep up 
with some of the fastest paper scanners available. 

Also, scanning both sides of a compressed bank 
check (scanned at 200 dots per square inch) results 
in an object size of about 20 kB. Therefore, the par- 
ticular configuration we tested could store 30 
checks per second with multiple processes, and 
6 checks per second with a single process. 
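
The multiple-process figures above follow from dividing sustained throughput by object size. A short sketch of that arithmetic, using the rounded 600-kB/s figure and the object sizes quoted in the text (the function name is illustrative):

```python
def objects_per_second(throughput_kb_s: float, object_kb: float) -> float:
    """Objects stored per second at a given sustained throughput."""
    return throughput_kb_s / object_kb


print(objects_per_second(600, 50))  # 12.0 A4 pages per second
print(objects_per_second(600, 20))  # 30.0 checks per second
```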

Capacity Testing 

We conducted two capacity tests. The first stored 
and fetched a 2-GB object in a DEC Rdb field, and the 
second built a 50-GB database. A 2-GB known pat- 
tern was generated in virtual memory. DEC Rdb 
wrote this object, with no AIJ, to a field in an empty
database. The BLOB column was mapped to three
disks, totaling 2.5 GB of storage. To avoid having to 
sustain storage area or file extensions, the storage 
area set was defined to be 2.3 GB. DEC Rdb was able 
to successfully insert and fetch the 2-GB object. 
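
A known-pattern test of this kind can be sketched at a much smaller scale. The version below is an assumption-laden stand-in: a temporary file plays the role of the database field, the pattern function and chunk size are invented for illustration, and the object is 1 MB rather than 2 GB. The idea it shows is the same: write a pattern whose content is predictable, read it back, and compare.

```python
import hashlib
import tempfile

CHUNK = 64 * 1024  # transfer size per write/read; an arbitrary choice


def pattern_chunk(offset: int, size: int) -> bytes:
    """Known pattern: each byte depends only on its absolute offset."""
    return bytes((offset + i) % 251 for i in range(size))


def write_pattern(stream, total: int) -> str:
    """Write `total` bytes of the pattern in chunks; return its SHA-256 digest."""
    digest = hashlib.sha256()
    offset = 0
    while offset < total:
        chunk = pattern_chunk(offset, min(CHUNK, total - offset))
        stream.write(chunk)
        digest.update(chunk)
        offset += len(chunk)
    return digest.hexdigest()


def read_digest(stream) -> str:
    """Read the stream to the end in chunks; return the SHA-256 digest."""
    digest = hashlib.sha256()
    while chunk := stream.read(CHUNK):
        digest.update(chunk)
    return digest.hexdigest()


def round_trip(total: int) -> bool:
    """Store the pattern, fetch it back, and check that nothing changed."""
    with tempfile.TemporaryFile() as store:
        written = write_pattern(store, total)
        store.seek(0)
        return read_digest(store) == written


print(round_trip(1_000_000))
```

Working in chunks and comparing digests means the check never needs a second full copy of the object in memory.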

To demonstrate the capacity that could be 
achieved with SQL Multimedia, DEC Rdb, and opti- 
cal storage, we built a 50-GB database. The hard- 
ware configuration consisted of the following: 

■ A VAX 4000 Model 500, with 6 GB of magnetic
disk and 128 MB of main memory 

■ A Kodak Automated Disk Library Model 6800, 
with 100 GB of storage (with a maximum capacity
of 1.2 TB)

■ DEC Rdb version 4.2 Field Test 0 

■ SQL Multimedia version 1.0 FT2 

■ Perceptics LaserStar optical disk software 

Starting with a backup of a 2-GB manufacturing 
database that was used by Digital's Mass Storage 
Group, DEC Rdb added an SQL Multimedia column
to a table that contained over 550,000 rows. DEC 
Rdb then mapped the column to five platters, mod- 
eled as ten 9.5-million-block (5.1-GB) magnetic 
disks to the OpenVMS operating system, using the 
sequential load algorithm. An update table cursor 
was devised that returned between 2,000 and 3,000
rows. Using SQL Multimedia, DEC Rdb inserted
images representing the disk assembly process 
until the storage was full. 

Conclusion 

The multimedia features that have been added to 
Rdb are in direct support of the increasing demand 
for computer data storage and indexing of multimedia
object types (i.e., text, still images, compound
documents, audio, and video). Relational
database systems must expand mass storage device
support, physical database design, language
functionality, and performance to manage
the variety of today's information. The development 
of this advanced technology in Digital's DEC Rdb 
product provides desktop computer-to-optical 
disk jukebox integration by means of a commercial 
database. As multimedia technology matures, data- 
bases must address the need to store and index 
information beyond numbers and characters. 

The work accomplished to support multimedia 
objects in DEC Rdb is just "the tip of the iceberg." 
Current multimedia capabilities are able to success- 
fully manage the majority of document and still 
frame applications. However, improvements in
capacity and performance are required before the
database can serve multiple channels of video and 
audio data. As the SQL standard evolves to incorpo- 
rate a more object-oriented mechanism, much of 
the SQL Multimedia functionality will migrate to 
using standard interfaces to define, operate on, and 
query abstract data types. 
DECspin: A Networked
Desktop Videoconferencing
Application

The Sound Picture Information Networks (SPIN) technology that is part of the 
DECspin version 1.0 product takes digitized audio and video from desktop comput- 
ers and distributes this data over a network to form real-time conferences. SPIN uses 
standard local and wide area data networks, adjusting to the various latency and 
bandwidth differences, and does not require a dedicated bandwidth allocation. 
A high-level SPIN protocol was developed to synchronize audio and video data 
and thus alleviate network congestion. SPIN performance on Digital's hardware 
and software platforms results in sound and pictures suitable for carrying
on personal communications over a data network. The Society of Technical
Communication chose the DECspin version 1.0 application as a first-place recipient
of the Distinguished Technical Communication Award in 1992. 



In late 1990, we began to design a software product 
that would allow people to see and hear one 
another from their desktop computers. The result- 
ing DECspin version 1.0 application takes digitized 
audio and video data from two to eight desktops 
and distributes this data over a network to form 
real-time conferences. The product name rep- 
resents the four major communication elements 
that unite into one cohesive desktop applica- 
tion, namely, sound, picture, information, and 
networks. The overall technology is referred to as 
SPIN. This paper first presents an introduction to 
conferencing and gives a brief overview of the 
framework on which SPIN was developed. The 
paper then details SPIN's graphical user interface.
Although the high-level protocol (which is the 
application layer of the International Organization 
for Standardization/Open Systems Interconnection 
[ISO/OSI] model) that SPIN uses to synchronize 
distributed audio and video is proprietary, a gen- 
eral discussion of how SPIN uses standard data 
networks for conferencing is presented. Perfor- 
mance data for DECspin version 1.0 running on 
a DECstation 5000 Model 200 workstation with 
DECvideo and DECaudio hardware follows the dis- 
cussion of network considerations. Finally, the 
paper summarizes the future direction of desktop 
conferencing. 



Introduction to Conferencing 

When the SPIN project started, standalone telecon- 
ferencing products were available but not for desk- 
top computers. Typically, the products offered 
cost as much as $150,000, required scheduled con- 
ference rooms and operators, and needed leased 
telephone lines. These systems did not operate as 
part of a corporate computer data network but 
instead required dedicated, switched 56-kilobit-per-second
(kb/s), T1 (1.5-megabit-per-second
[Mb/s]), and T3 (45-Mb/s) public telephone components
in order to operate. Originally designed
as two-way conference units, these teleconferenc- 
ing products later included hardware to multiplex 
several equally equipped systems. In addition, 
the enhanced systems included custom logic to 
implement a hardware compressor/decompressor 
(codec) that reduced digital video data rates sufficiently
to use leased telephone lines.

During the last several years, other conferencing 
systems have been demonstrated. The Pandora 
research project by Olivetti Research resulted in 
an excellent desk-to-desk conferencing system. 
Although the Pandora system was expensive per 
user and did not use existing network protocols, it 
did prove the viability of using a digital conferenc- 
ing system from one's office and demonstrated the 
natural progression from room conferencing to 
office conferencing. This system served as a good
example for our own emerging desktop model,
DECspin version 1.0.

Throughout this same period, several compres- 
sion standards suitable for video capture and 
playback have evolved and been implemented. The 
Joint Photographic Experts Group (JPEG) industry- 
standard algorithm results in intraframe compres- 
sion of frames of high-quality video (on the order of 
25 to 1). 1,2 This algorithm is well suited for either
single-frame capture or motion-frame capture of 
video information. This form of compression is 
most appropriate for real-time video capture and 
playback where low (i.e., frame-by-frame) latency 
is required. 

The Motion Picture Experts Group (MPEG) stan- 
dard results in interframe compression of motion 
video. 3 This algorithm is well suited for motion-frame
capture of video because only the differences
between successive frames are stored. Interframe 
compression is appropriate for video capture and 
playback where real-time low latency is not 
required. 
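
The idea of storing only the differences between successive frames can be shown with a toy sketch. This is not the MPEG algorithm, which also relies on motion compensation and transform coding; the functions below are hypothetical and simply record which pixels changed, assuming both frames have the same size.

```python
from typing import Dict, List

Frame = List[int]  # one gray-scale value per pixel


def encode_delta(prev: Frame, cur: Frame) -> Dict[int, int]:
    """Keep only the pixels that differ from the previous frame."""
    return {i: v for i, (p, v) in enumerate(zip(prev, cur)) if p != v}


def decode_delta(prev: Frame, delta: Dict[int, int]) -> Frame:
    """Rebuild the current frame from the previous frame plus the changes."""
    out = list(prev)
    for i, v in delta.items():
        out[i] = v
    return out


frame1 = [10, 10, 10, 10, 10, 10]
frame2 = [10, 10, 99, 10, 10, 10]  # a single pixel changed

delta = encode_delta(frame1, frame2)
print(delta)                                  # {2: 99}
print(decode_delta(frame1, delta) == frame2)  # True
```

The sketch also shows why interframe schemes suit playback better than low-latency use: a frame cannot be decoded until the frame it depends on has arrived and been decoded.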

The H.261 standard results in interframe com- 
pression of motion video that is most responsive to 
the demands placed on capturing live video for dis- 
semination over low-bandwidth public telephone 
networks. 4 This compression is suitable for video 
capture and playback with reasonable latency but is 
not quite real-time in nature. H.261 is the standard 
used most in the teleconferencing systems on the 
market today. 

Finally, the last few years have also witnessed 
the emergence of dramatic new base computer and 
network technologies. Reduced instruction set 
computer (RISC)-based workstations supply the
needed processing power and I/O bandwidth to 
process large and continuous amounts of data, and 
fiber distributed data interface (FDDI) technology 
results in 100-megabit-per-second local area net- 
works for the desktop. Consequently, the SPIN 
development project got under way to provide a 
novel and innovative software application that 
could take advantage of the powerful new systems 
and networks. 

Overview of Underlying 
Hardware and Software 

We came up with the SPIN project in response to 
the question: How can we communicate easily 
with graphics, video, and audio on the desktop 
as well as over both local and wide geographical
area networks? Video help documentation, textual
help, and audio help are used on the desktop to 
communicate how the application works. Sound, 
picture, graphics, and network elements are all 
woven together to provide better communication 
among conference participants. 

Early in 1991, we received our first prototype of 
the DECvideo TURBOchannel frame buffer, which 
included the necessary hardware to input and cap- 
ture an analog video signal, to digitize the signal, 
and to display the pixel information on the screen. 
The frame buffer was special in that it displayed 
8-bit pseudocolor, 8-bit gray-scale, and 24-bit true- 
color graphical data simultaneously. This feature 
allowed captured video data to be displayed with- 
out data dithering. 

Dithering is the process of converting each pixel 
of video data to a form that matches a limited 
number of available colormap entries. Most work- 
station frame buffers are 8-bit pseudocolor. Hence, 
digitized, 24-bit true-color video data for display 
would need pixel-by-pixel conversion. Algorithms 
exist that could be used to accomplish this conver- 
sion. However, a better SPIN conference, in terms 
of frame rate and picture quality, was achieved by 
performing no software dithering, thus relying 
on the ability of the DECvideo hardware to display 
24-bit true-color video or 8-bit gray-scale video. 5 In
addition, the DECvideo hardware could scale down
the incoming video image in real time so that fewer 
pixels (i.e., less data) represented the original 
image. 
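
To give a sense of the per-pixel work that software conversion would involve, the sketch below shows the crudest form of the pixel-by-pixel conversion: truncating each 24-bit pixel to an index into a fixed 256-entry colormap laid out as 3 bits of red, 3 of green, and 2 of blue. The 3-3-2 layout and the function names are assumptions for illustration; real dithering algorithms add an ordered or error-diffusion pattern on top of this step to hide the resulting banding.

```python
def rgb_to_332(r: int, g: int, b: int) -> int:
    """Map a 24-bit RGB pixel to an 8-bit index: 3 bits red, 3 green, 2 blue."""
    return (r & 0xE0) | ((g & 0xE0) >> 3) | (b >> 6)


def convert_frame(pixels):
    """Convert a frame given as a list of (r, g, b) tuples, one call per pixel."""
    return bytes(rgb_to_332(r, g, b) for r, g, b in pixels)


frame = [(255, 255, 255), (255, 0, 0), (0, 0, 0)]
print(list(convert_frame(frame)))  # [255, 224, 0]
```

Even this minimal conversion runs once for every pixel of every frame, which is the cost that displaying true-color or gray-scale video directly in hardware avoids.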

Concurrently, SPIN used a DECaudio TURBOchannel 
card that could sample an input analog audio signal 
from a microphone and deliver an 8-kilohertz digi- 
tized audio bit stream. The DECaudio hardware 
could also convert a digital audio stream for output 
to an analog speaker or external amplifier. A 
DECstation 5000 Model 200 with DECaudio and 
DECvideo components provided the core hardware 
capability used in SPIN development work. 

In addition to these new hardware capabilities, 
the SPIN effort needed new underlying base software
capabilities. The DECvideo hardware required
the Xv video extension to the X Window System to 
allow for the display and capture of video data. (The 
Xv extension was jointly developed by base system 
graphics and MIT Project Athena teams.) The
DECaudio component used the AudioFile audio 
server, developed by Digital's Cambridge Research 
Laboratory, to capture and play back digital audio 
data. 
A prototype software base was created to make
fundamental measurements of video and audio data 
manipulation within the workstation and over a 
network. Testing the prototype over a 100-Mb/s 
FDDI network and a 10-Mb/s Ethernet network 
demonstrated that a conferencing product running 
over existing network protocols was possible. 

The SPIN Application 

SPIN is a graphical multimedia communications 
tool that allows two to eight people to sit at their 
desktop computers and communicate both visually 
and audibly over a standard computer data net- 
work. The user interface employs a telephone-like 
"push" model that allows a user to place an audio-only,
video-only, or audio-video call to another
desktop computer user. Here, the term "push" 
means that SPIN conference participants control all 
aspects of the digitized data they send onto a net- 
work. Thus, users can feel confident about the secu- 
rity of their audio and video information. A caller 
initiates all calls to other users, and a call recipient 
must agree to accept an incoming SPIN call. Because 
all data is in the digital domain, this model makes it 
almost impossible to use SPIN to eavesdrop on 
another person. Placing a wiretap on a person's call 
would involve intercepting network packets, sepa- 
rating data from protocol layers, and then reassem- 
bling data into meaningful information. If the 
network data were encrypted, interception would 
be impossible. SPIN also provides other communi- 
cation services, such as an audio-video answering 
machine, messaging, audio-video file creation, 
audio help, and audio-video documentation. 
Figure 1 shows a screen capture of a SPIN session in 
progress, using the DECspin version 1.0 application. 

The product is easy to learn and to use. The 
graphical user interface is implemented on top of 
Motif software. Motif provides the framework for 
the SPIN international user interface. A model was 
chosen in which all actions taken by a user are 
implemented by push buttons that activate pop-up 
menus. The SPIN application does not use pull- 
down menus, because they require language- 
specific text strings to identify the purpose of an 
entry and thus require translation for different 
countries. Also, pull-down menus are intended for 
short-term interaction, and SPIN menus usually 
require more long-term interaction. All push- 
button icons are pictorial representations of the 
intended function. For example, the main window 
has a row of five push buttons, each of which
activates a specific function of the application and
is shown in Figure 1. 

In the main window, the first button from the left 
contains a green circle with a vertical white bar, the 
international symbol for exit. This button appears 
in the same location in each of the pop-up win- 
dows. It is used to exit the window or, in the main 
window, to exit the application. 

The second button from the left is labeled with 
the communication icon. This button is used to 
select the call list shown in Figure 2. The call list 
contains the various buttons and widgets used to 
place a call to another user, to create and play back 
SPIN files, and to display a list of received SPIN mes- 
sages, if any exist. The list provides a way to play 
and manage audio-video answering machine mes- 
sages. For example, to place a call to another user 
on the network requires just three steps. 

1. Enter the computer network name of the 
machine and user into SPIN's phone database as
"user@desktop." A string representing some- 
thing more understandable to a novice is also 
allowed, e.g., "user@desktopl.dec.com" becomes 
"user@desktopl.dec.com Firstname Lastname at 
Digital Equipment Corporation." 

2. Select whether the call is to be sound only,
picture only, or both. The toggle push buttons 
under the large note icon control audio select; 
those under the large eye icon control video 
select. Once the call is established, these but- 
tons can be set or unset by clicking a mouse or 
using a touch-screen monitor and are useful 
for muting the audio portion or freezing frames 
of the video portion. 

3. To establish a two-way network connection, 
press the call push button under the connection 
icon (which is labeled with two arrows going in 
opposite directions) that appears next to the 
desired call recipient. If the person called is 
logged on, a ring dialog box appears on the 
call recipient's screen and a bell rings. If the call 
recipient is not available, a dialog box appears on 
the caller's screen asking whether the caller 
wishes to leave a message. The caller can then 
choose to leave a message or not. 

Depending on the individual settings, users can 
see and hear one another in multiple windows 
on the screen. To connect all conference partici- 
pants in a mesh, press the "join" push button, 
which has a triangular icon. 
Returning to the main window, the middle push
button is the SPIN control button. As shown in
Figure 3, the SPIN control pop-up window contains
slide bars that, from top to bottom, allow the caller 
to set maximum capture frame rate, hue, color sat- 
uration, brightness, contrast, speaker output vol- 
ume, and microphone pickup gain. At the bottom 
of the control window are buttons for selecting 
compression and rendering. 

To the right of the control button in the main 
window is the status icon button. Pressing this but- 
ton causes the status pop-up window shown in 
Figure 4 to appear. The status window displays, 
below the camera icon, the active size of the cap- 
tured video area in pixels. Beneath these dimen- 
sions is a vertical slide bar that indicates the average 
frames-per-second (frames/s) capture rate sampled 
over a five-second interval. To the right of the 
camera icon is the connection icon, under which 
appears the number of active connections. Below 
this number are the sound and picture icons, under
which appear the number of active audio connections
and the number of active video connections,
respectively. The second slide bar indicates the 
result of sampling the average outgoing bandwidth 
consumption (measured in Mb/s) of the application 
on the network. This measurement is also updated 
every five seconds. 
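
One way to produce readings like those in the status window is a sliding-window average. The `RateMeter` class below is a hypothetical sketch, not DECspin code: it counts events (frames captured or bits sent) and reports the average rate over the most recent five seconds.

```python
from collections import deque


class RateMeter:
    """Average rate of recorded events over a sliding time window."""

    def __init__(self, window_s: float = 5.0):
        self.window_s = window_s
        self.events = deque()  # (timestamp in seconds, amount)

    def record(self, now: float, amount: float = 1.0) -> None:
        """Note that `amount` units (e.g., one frame) occurred at time `now`."""
        self.events.append((now, amount))
        self._expire(now)

    def _expire(self, now: float) -> None:
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window_s:
            self.events.popleft()

    def rate(self, now: float) -> float:
        """Units per second, averaged over the window ending at `now`."""
        self._expire(now)
        return sum(amount for _, amount in self.events) / self.window_s


meter = RateMeter()
for i in range(1, 51):      # 50 frames captured at 0.1-second spacing
    meter.record(i / 10)
print(meter.rate(5.0))      # 10.0 frames/s
```

The same meter can track outgoing bandwidth by recording the number of bits in each packet as the amount.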

Finally, the fifth push button (on the far right) in 
the main window is the information button. By 
pressing this button and selecting the type of on- 
line information desired, the user can access the 
documentation pop-up windows, as illustrated in 
Figure 5. Within each documentation window are 
several topics and two columns of toggle push but- 
tons that can be used to obtain either textual docu- 
mentation or video documentation. The video 
documentation comprises short videos that 
contain expert help about the operation of the 
application. 

Figure 3 SPIN Control Pop-up Window

As a final level of help, all push buttons and widgets
within the application have associated audio
tracks that tell the user what the buttons and
widgets do within their context in the application.
To activate the audio tracks, the user must first
select the button or widget and then press the Help
key on the keyboard.

Network Considerations 

SPIN uses standard data networks to transport the 
information that composes a conference. Data net- 
works are usually private networks that a user com- 
munity maintains. Such networks often include a 
number of individual networks joined together by 
bridges and routers. Unlike public telephone net- 
works, which are most frequently used for phone 
calls, private networks are used for a variety of 
computer data needs, including file transfers, 
remote logins, and remote file systems. However,
telephone networks often provide the long-distance
lines used to make up private wide area
data networks. 

The use of data networks allows conferencing 
data to be treated as would any other type of data. 
SPIN requires no special low-level networking pro- 
tocols to transmit its data and uses the transmission 
control protocol/internet protocol (TCP/IP) or the 
DECnet protocol. Also, SPIN requires no changes to 
existing operating systems. When performing the 
prototype work for the SPIN application, we were 
not certain whether the real-time nature of confer- 
encing could be accomplished on inherently 
non-real-time networks and operating systems. 
Consequently, we developed a special high-layer
synchronization conferencing protocol, called the 
SPIN protocol, that uses existing data networks. 
This protocol is responsible for the synchronization
of audio and video information. The SPIN protocol 
monitors the flow of data to the network in order 
to alleviate network congestion when detected. As 
the network becomes congested, the protocol 
makes the decision to withhold further video data, 
since video is the largest consumer of network 
bandwidth. This withholding of video data is a key 
feature of the SPIN protocol, because it allows a 
conference to vary the video frame rate on a user- 
by-user basis. Thus, video bandwidth can scale to 
the lesser of either the bandwidth available or the 
number of frames/s of video bandwidth that a given 
platform can sustain. 
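
The "lesser of" rule above amounts to taking a minimum. In the sketch below the function name and all numeric values are illustrative assumptions: a frame of 160 by 120 pixels at 8 bits per pixel (153,600 bits), a platform limit of 15 frames/s, and a link that is fully available to the conference.

```python
def sustainable_frame_rate(available_bps: float,
                           bits_per_frame: float,
                           platform_fps: float) -> float:
    """Frame rate limited by the network or by the platform, whichever is lower."""
    network_fps = available_bps / bits_per_frame
    return min(network_fps, platform_fps)


BITS_PER_FRAME = 160 * 120 * 8  # assumed uncompressed gray-scale frame

# Plenty of bandwidth: the platform is the limit.
print(sustainable_frame_rate(10_000_000, BITS_PER_FRAME, platform_fps=15))

# Scarce bandwidth: the network is the limit.
print(sustainable_frame_rate(1_536_000, BITS_PER_FRAME, platform_fps=15))
```

Because the minimum is evaluated for each participant separately, a slow workstation or a slow link lowers the frame rate only for the user behind it.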

If the withholding of video corrects the network 
congestion, video data is once again allowed in the 
conference. If not, the SPIN protocol delays audio 
data and stores it in a buffer until the network is 
able to handle this data. If the network outage lasts 
approximately 10 seconds, audio data is lost. 
Periods of audio silence are used as a means of 
recovery from periods of network congestion. 



Thus, variable video frame rates along with this 
treatment of audio data allow for the graceful degra- 
dation of a conference as the network becomes 
busy. 
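
The degradation sequence just described (withhold video first, then buffer audio, then lose audio after roughly 10 seconds) can be sketched as a small state machine. The SPIN protocol itself is proprietary, so everything below is an assumption made for illustration: the class name, the state names, the one-step escalation, and the exact 10-second cutoff.

```python
from collections import deque


class CongestionPolicy:
    """Toy model: withhold video first, then buffer audio, then drop old audio."""

    def __init__(self, audio_limit_s: float = 10.0):
        self.state = "normal"
        self.audio_limit_s = audio_limit_s
        self.backlog = deque()  # (timestamp, audio packet) awaiting delivery
        self.lost = 0

    def step(self, now, congested, audio=None):
        """Advance one interval; return (send_video, audio packets to send)."""
        if not congested:
            # Network recovered: resume video and flush buffered audio in order.
            self.state = "normal"
            out = [pkt for _, pkt in self.backlog]
            self.backlog.clear()
            if audio is not None:
                out.append(audio)
            return True, out
        if self.state == "normal":
            # First reaction to congestion: withhold video, keep audio flowing.
            self.state = "video_withheld"
            return False, ([audio] if audio is not None else [])
        # Congestion persisted: delay audio in a buffer.
        self.state = "audio_buffered"
        if audio is not None:
            self.backlog.append((now, audio))
        while self.backlog and now - self.backlog[0][0] >= self.audio_limit_s:
            self.backlog.popleft()  # buffered too long: this audio is lost
            self.lost += 1
        return False, []


policy = CongestionPolicy()
print(policy.step(0, False, "a0"))  # (True, ['a0'])
print(policy.step(1, True, "a1"))   # (False, ['a1'])
print(policy.step(2, True, "a2"))   # (False, [])
print(policy.step(3, False, "a3"))  # (True, ['a2', 'a3'])
```

Video is sacrificed first because it dominates bandwidth, while audio is protected for as long as the buffer allows.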

SPIN has been demonstrated over a variety of 
public and private data networks including 
Ethernet (10 Mb/s), FDDI (100 Mb/s), T1 (1.5 Mb/s),
T3 (45 Mb/s), cable television (10 Mb/s, more cor- 
rectly, Ethernet running over two 6-megahertz 
cable television channels), switched multimegabit 
data service (SMDS) (1.5 or 45 Mb/s), asynchronous 
transfer mode (ATM) (150 Mb/s), and frame relay 
(1.5 or 45 Mb/s). Some of these networks are local 
or metropolitan area technologies, i.e., local area 
networks (LANs), whereas others are wide area 
technologies, i.e., wide area networks (WANs), as 
illustrated in Figure 6. 

Each type of network provides SPIN with differ- 
ent latency and bandwidth characteristics. SPIN 
makes corresponding adjustments to a conference 
to account for these differences and does not 
require a dedicated bandwidth allocation to carry 
on a conference. If a given network supports bandwidth
allocation, this feature only enhances SPIN's
ability to deliver video and audio information. 

WANs may use a router to interconnect two or
more LANs. SPIN has been tested on a number of
routers with mixed results, i.e., some routers correctly
handle SPIN's bidirectional traffic pattern
whereas others do not. Since some routers do not
correctly handle bidirectional data traffic without
packet loss, wide area routers must be individually
tested with SPIN to verify proper operation. Some
router problems were traced to the use of old
firmware or software. Consequently, SPIN acted
like a diagnostic tool in pointing out these problems.
For example, running the SPIN application
with audio only, across Digital's private IP network,
yields varied results. Digital's IP network is an example
of an open network, with routers from most
router vendors. We traced most instances of poor
SPIN performance to old or obsolete routers (some
in service for the last six years without upgrades).
These routers usually dropped packets when routing
between adjacent Ethernet networks that were
only 10 percent busy. After these routers were
upgraded to the DECNIS family of routers, the SPIN
application functioned correctly, even on congested
networks.

To demonstrate daily use of SPIN, we created a
metropolitan area network (MAN). Figure 7 shows
the network topology, which spanned the states of
New Hampshire and Massachusetts. The test bed
allowed us to demonstrate our FDDI products,
including end-station FDDI adapter cards, multimode
FDDI wiring concentrators, and single-mode
FDDI wiring concentrators. SPIN was used in 30
workstations, two of which were attached to large-screen
projection units in conference rooms.

Performance 

The conference quality achieved when running the 
SPIN application depends on many factors. The 
available network bandwidth, the processor speed, 
the desired frame-rate specification, the compression
setting, the picture size, and how the pictures
are rendered all affect the quality of the conference. 
Table 1 contains performance data for DECspin ver- 
sion 1.0 at various combinations of settings for 
these factors. 
Table 1 SPIN Performance on a DECstation 5000 Model 200 with
DECvideo and DECaudio Hardware

As shown in Table 1, we tested SPIN performance
using two basic picture sizes: 256 by 192 pixels and 
160 by 120 pixels. The tests were performed over 
both Ethernet and FDDI networks for black-and- 
white and color cases. Also noted in the table is 
whether or not software compression was enabled 
for a specific test case. The far right column shows 
the frame rate achieved for the different combina- 
tions and also summarizes the network bandwidth 
consumed in each test. The table is presented pri- 
marily to give a sampling of the frame rate and, 
hence, the level of visual quality achieved for a spe- 
cific combination of parameters. Frame rates affect 
an observer's ability to detect change within a
sequence of frames. With a slow frame rate, the 
resulting video sequence may appear choppy and 
incomplete, whereas a normal frame rate (24 to 30 
frames/s) leads to a smoothly varying video 
sequence with even continuity from one sequence 
to another. The frame rates in Table 1 below about 6 
to 7 frames/s are considered low quality. Those in 
the 8-to-19-frames/s range are considered good 
quality, and those in the 20-to-30-frames/s range 



are high-quality video. The best cases in Table 1 are 
those that used software compression to deliver a 
pleasing frame rate with the least amount of net- 
work bandwidth consumed together with some 
degradation of individual frame quality. The soft- 
ware compression was tuned to provide nearly the 
same frame quality as the uncompressed case. 

Table 1 also shows performance data measured 
using a DECNIS router. As noted earlier, wide area 
usage of SPIN depends on a router with correct algo- 
rithms for handling of bidirectional continuous 
stream traffic. The DECNIS family of routers can 
supply the full T1 bandwidth when presented with
bidirectional SPIN traffic. Other routers on which
SPIN was tested typically delivered only 25 to 50
percent of the T1 bandwidth. Note that this was
only true on the particular routers we tested and
that routers other than DECNIS routers may also be
able to deliver full T1 bandwidth for this particular
traffic pattern. 

Hardware compression technology mentioned in 
the section Overview of Underlying Hardware and 
Software reduces the bandwidth requirements for 
conferencing. Experimentation with motion JPEG
compression (using the Xv extension with compression
functions on an Xvideo frame buffer
board) has shown that at a resolution of 320 by 240
pixels, true-color frames can be used at 15 to 20
frames/s at a bit rate of just under 1.0 Mb/s. This bit
rate produces a good- to high-quality conference
with very low latency. H.261 and MPEG technology
result in similar frame rates and picture size at
about one-half the bandwidth but higher overall
latency. Using motion JPEG as the example, high-quality
conferences require about 1 Mb/s per
connection. If all conferences are to be high quality,
this bit rate allows 1 two-party conference
on a T1 connection, 5 two-party conferences on an
Ethernet segment, and 50 two-party conferences
on an FDDI network. Using GIGAswitch FDDI
switches, more than 500 two-party conferences
can take place simultaneously on a network. More
users could be supported on T1, Ethernet, or
GIGAswitch networks, if lower-quality conferences
are acceptable.
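
The conference counts above can be reproduced with simple arithmetic. The
following Python sketch is illustrative only. It assumes nominal link rates of
1.544 Mb/s for T1, 10 Mb/s for Ethernet, and 100 Mb/s for FDDI (standard
figures, not values taken from Table 1), treats a two-party conference as one
1-Mb/s stream in each direction, and assumes that a T1 line carries each
direction separately whereas Ethernet and FDDI stations share one medium. The
function name and the table of links are hypothetical.

```python
# Nominal link rates in Mb/s. These are assumed standard figures,
# not measurements from the paper.
LINKS = {
    "T1": {"rate": 1.544, "shared": False},      # full-duplex point-to-point line
    "Ethernet": {"rate": 10.0, "shared": True},  # one medium shared by all stations
    "FDDI": {"rate": 100.0, "shared": True},     # one ring shared by all stations
}

STREAM_MBPS = 1.0  # high-quality motion JPEG, per connection (from the text)


def two_party_conferences(rate, shared, stream=STREAM_MBPS):
    """Estimate how many high-quality two-party conferences fit on a link."""
    # Each conference has one stream per direction. On a shared medium both
    # streams draw on the same bandwidth; on a full-duplex line each
    # direction has the full rate to itself.
    streams_competing = 2 if shared else 1
    return int(rate // (stream * streams_competing))


for name, link in LINKS.items():
    print(name, two_party_conferences(link["rate"], link["shared"]))
```

Under these assumptions the sketch yields 1, 5, and 50 conferences, which
agrees with the figures quoted for T1, Ethernet, and FDDI.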

Conclusion 

It became clear during the development and 
deployment of SPIN that high cost per user limits 
the widespread use of the application. The cost of 
video for DECspin version 1.0 adds about $8,000 to 
the price of a workstation. Audio for version 1.0 
adds about $2,000 per workstation. These costs, 
which are prohibitive to most potential users of 
the technology, do not include the network cost 
impact. 

Digital's Alpha AXP family of computers comes
with audio input and output hardware as part of the 
base workstation. In spring 1993, Digital released to 
the Internet community a version of DECspin that 
uses this hardware to carry on audio-only confer- 
ences and shows the user a voice waveform instead 
of a video image. This version eliminates the add-on 
hardware cost for audioconferencing. A new low- 
cost video option would go far to reduce the add-on 
cost for video and facilitate a wider use of the SPIN 
application. 

The SPIN application and its associated protocol 
have been demonstrated on Digital and non-Digital 
computers, operating systems, and networks. In 
particular, SPIN has been shown on SPARC worksta- 
tions running Solaris software. Additionally, SPIN 
has been demonstrated on a personal computer 
using the Microsoft Multimedia Extensions (MME) 
to Windows software. This platform provides a 



very large user community of potential SPIN users 
and dramatically drops the price per user compared 
with the original product. Interoperability among 
platforms and a common user interface give Digital 
a leadership position in this fast-forming market. 

Today, high-quality conferencing can scale to 
hundreds of seats on a LAN with lower-quality con- 
ferencing scaling to larger, more geographically dis- 
persed networks. Several factors will lead to the 
widespread use of this technology: better and less- 
expensive hardware, programmable codecs, and 
higher-speed and less-costly cross-country net- 
works. Less-expensive video hardware allows many 
users to upgrade their systems to include video, 
while programmable compression technology 
allows users to achieve improvements in picture 
quality, compression transcoding, and lower net- 
work needs. Higher-capacity and less-costly cross- 
country networks allow more users to access 
conferencing services. Ultimately, even homes will 
have better computer connectivity and bandwidth. 
As these changes occur, and we believe they will, 
desktop conferencing can become the interactive 
telephone of the twenty-first century. 

LAN Addressing for
Digital Video Data

Multicast addressing was chosen over the broadcast address and unicast address 
mechanisms for the transmission of video data over the LAN. Dynamic allocation of 
multicast addresses enables such features as the continuous playback of full 
motion video over a network with multiple viewers. Design of this video data transmission
system permits interested nodes on a LAN to dynamically allocate a single
multicast address from a pool of multicast addresses. When the allocated address is 
no longer needed, it is returned to the pool. This mechanism permits nodes to use 
fewer multicast addresses than are required in a traditional scheme where a 
unique address is allocated for each possible function. 



The transmission of digital video data over a local 
area data network (LAN) poses some particular 
challenges when multiple stations are viewing the 
material simultaneously. This paper describes the
available addressing mechanisms in popular LANs 
and how they alleviate problems associated with 
multiple viewing. It also describes a general mecha- 
nism by which nodes on a LAN can dynamically allo- 
cate a single multicast address from a pool of 
multicast addresses, and subsequently use that 
address for transmitting a digital video program to a 
set of interested viewers. 

Project Goals 

The objective of this project was to design a mecha- 
nism suitable for providing the equivalent of broad- 
cast television using computers and a local area 
data network in place of broadcast stations, air- 
waves, and televisions. The resulting system had to 
provide access to broadcast, closed circuit, and on- 
demand video programs throughout an enterprise 



using its computers and data network. The use of 
computer equipment installed for data transmis- 
sion would eliminate the need to invest in cable TV 
wiring throughout a building. 

The basic system would consist of two primary 
components. One computer, or set of computers, 
would act as a video server by transmitting video 
program material, in digital form, onto the data net- 
work. Other computers, acting as clients, would 
receive the transmitted video program and present 
it on the computer's display. Figure 1 depicts such a 
configuration. 

The variety of video source material suggests that
servers may be equipped in several ways. For exam- 
ple, accessory hardware can receive broadcast 
video programs; hardware and software can con- 
vert analog video into digital format; and hardware 
and software can compress the digital video for efficient
use on a personal computer and data network.[1,2,3]
Figure 2 shows a server equipped to
handle different types of video program sources. 

Figure 1 Client-server System for Video Data Transmission

Figure 2 Types of Video Program Sources



Video program material is categorized as live, 
e.g., the current program broadcasting on a televi- 
sion network, or stored and played on demand, 
e.g., a recorded training session. In both cases, it is 
desirable for more than one client to be able to 
monitor or view the transmitted video program. 

To implement the client-server system described 
above, many technical hurdles had to be overcome. 
This paper, however, focuses on one narrow but 
critical aspect, the addressing method used on the 
LAN for delivery of the digital video data. The characteristics
of digital video and the need for multiple
stations to receive programs from a wide range of 
possible sources combined to create some interest- 
ing challenges in devising a suitable addressing 
method. 

Choosing an Addressing Method 

To transmit digital video over a data network, an 
effective addressing mechanism had to be chosen 
that would satisfy the project's goals. Most LANs 
support three basic data addressing mechanisms: 
broadcast, unicast, and multicast. Each method
of transmitting digital video over a LAN has charac- 
teristics that are both attractive and undesirable. 

Broadcast addressing uses a special reserved des- 
tination address. By convention, data sent to this 
address is received by all nodes on the LAN.
Transmitting digital video data to the broadcast 
address serves the purpose of permitting multiple 
clients to receive the same transmitted video pro- 
gram while permitting the server to transmit the 
data once to a single address. Viewed another way, 
this convention is a significant disadvantage 



because all stations receive the data whether they 
are interested or not. Compressed digital video rep- 
resents from 1 to 2 megabits per second of data; 
therefore nodes not expecting to receive the video 
data are impacted by its unsolicited arrival. As a
further complication, when two or more video pro- 
grams are playing simultaneously, stations receive 1 
to 2 megabits per second or more of data for each 
video program. This renders many systems inoper- 
ative. Furthermore, LAN bridges pass broadcast 
messages between LAN segments and cannot confine
digital video data to a LAN segment. As a result
of these drawbacks, use of the broadcast address is 
unsuitable for transmission of digital video data. 

Unicast addressing sends data to one unique 
node. The use of unicast addressing eliminates the 
problems encountered with broadcast addressing 
by confining receipt of the digital video data to a 
single node. This approach works quite well as long 
as only one node wishes to view the video program. 
If multiple clients wish to view the same program, 
then the server has to retransmit the data for each 
participating client. As the number of viewing 
clients increases, this approach quickly exhausts 
the server's capacity and congests the LAN. Because 
unicast addressing cannot practically support one 
server in conjunction with multiple clients, it too is 
unsuitable for transmission of digital video data. 

Multicast addressing uses addresses designated 
to simultaneously address a group of nodes on a 
LAN. Nodes wishing to be part of the addressed 
group enable receipt of data addressed to the multi- 
cast address. This characteristic makes multicast 
addressing the ideal match for the simultaneous 
transmission of digital video data to multiple client
nodes without sending it to uninterested nodes.
Furthermore, many network adapters provide
hardware-based filtering of multicast addresses,
which permits high-performance rejection/selection
of data based on the destination multicast
address. Because of these advantages, multicast
addressing was selected as the mechanism for trans- 
mission of digital video data. 
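
The trade-offs among the three mechanisms can be made concrete with a rough
load model. The Python sketch below is an illustration under stated
assumptions, not part of the described system: it assumes a single stream of
1.5 Mb/s (the midpoint of the 1-to-2-megabit range cited above), and the
function name and parameters are hypothetical.

```python
def lan_load(method, nodes, viewers, stream_mbps=1.5):
    """Return (server transmit rate in Mb/s, number of nodes that receive the data).

    nodes   -- stations on the LAN other than the server
    viewers -- stations that actually want the video program
    """
    if method == "broadcast":
        # Sent once, but every station must accept it.
        return stream_mbps, nodes
    if method == "unicast":
        # Only viewers receive it, but the server repeats the data per viewer.
        return stream_mbps * viewers, viewers
    if method == "multicast":
        # Sent once, and only stations that enabled the address accept it.
        return stream_mbps, viewers
    raise ValueError(f"unknown addressing method: {method}")


for method in ("broadcast", "unicast", "multicast"):
    tx, receivers = lan_load(method, nodes=100, viewers=10)
    print(f"{method:9s} server sends {tx:5.1f} Mb/s, {receivers} nodes receive it")
```

Only the multicast case keeps both quantities small: the server transmits the
stream once, and only the interested stations receive it.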

Multicast Addressing Considerations 

Together with its advantages, multicast addressing 
brought significant problems to be overcome. The 
problems were in the assignment of multicast 
addresses to groups of nodes, all of which are inter- 
ested in the same video program. If a single multi- 
cast address were assigned for all stations 
interested in receiving any video program, then 
only interested stations would receive data. All par- 
ticipating stations, however, would receive all pro- 
grams playing at any given time. If multiple 
programs were playing, each station would receive 
data for all programs even though it is interested in 
the data for only one of the programs. The obvious 
solution is to allocate a unique multicast address for 
each possible program. The following sections 
examine various allocation methods. 

Traditional Address Allocation 
Traditionally, a standards committee allocates mul- 
ticast addresses, each of which serves a specific 
purpose or function. For example, a specific multi- 
cast address is allocated for Ethernet end-station 
hello messages, and another is allocated for fiber 
distributed data interface (FDDI) status reporting 
frames. Each address serves one explicit function.
This static allocation breaks down when a
large number of uses for multicast addresses fall 
into one category. 

It clearly is not possible to allocate a unique 
multicast address for all possible video programs 
for several reasons. At any given time, hundreds 
of broadcast programs are playing throughout 
the world, and thousands of video programs 
and clips are stored in video libraries. Countless 
more are being created every minute. Assigning a 
unique address to each possible video program 
would exhaust the number of available addresses 
and be impossible to administer. Furthermore, 
it would waste multicast addresses since only 
those programs currently playing on a given 
LAN (or extended LAN) need an assigned address. 



A technique, therefore, is needed by which a block 
of multicast addresses is permanently allocated for 
the purpose of transmitting video programs on a 
computer network, and individual addresses are 
dynamically allocated from that block for the dura- 
tion of a particular video program. 

Dynamic Allocation Method 
A dynamic allocation method should have several 
characteristics to transmit video programs on a 
LAN. Specifically, the method

1. Must be consistent with current allocation pro- 
cedures used by standards bodies like the IEEE 

2. Should be fully distributed and not require a 
central database (improves reliability) 

3. Must support multiple clients and multiple 
servers 

4. Must operate correctly in the face of LAN per- 
turbations like segmentation, merging, server 
failure, and client failure 

It is clearly desirable to use a dynamic allocation 
mechanism that does not require changes to the 
way addresses are allocated by standards commit- 
tees. Changes to protocols only create another level 
of administrative complexity. Instead, a single set of 
addresses should be allocated on a permanent basis 
for use in the desired application. Drawn from a 
pool of addresses, these allocated addresses could 
be dynamically assigned to video programs as they 
are requested for playback. When playback was 
complete, the address would be returned to the 
pool. 

Regardless of which allocation mechanism is 
adopted, it needs to support multiple servers and 
multiple clients. This implies that some form of 
cooperation exists between the servers to prevent 
multiple servers from allocating the same address 
for two different video programs. One node could 
act as a central clearinghouse for the allocation of 
addresses from the pool, but the overall operation 
of the system would then be susceptible to failure 
of that node. The preferred approach is a fully dis- 
tributed mechanism that does not require a central- 
ized database or clearinghouse. 

LANs tend to be constantly changing their config- 
urations, and nodes can enter and leave a network 
at any time. As a result, an allocation mechanism 
must be able to withstand common and uncommon 
perturbations in the LAN. It must accommodate 
events such as the segmentation of a LAN into two
LANs when a bridge becomes inactive or discon- 
nected, joining of two LANs into one when a bridge 
is installed or becomes reactivated, and failure or 
disconnection from the LAN at any time by both 
server and client nodes. 

Other Multicast Allocation Methods 
A variety of different group resource allocation 
mechanisms exist, and the one most nearly applica- 
ble to transmitting digital video over a LAN is used 
in the internet protocol (IP) suite. Deering discusses
extensions to the internet protocols to support
multicast delivery of internet datagrams.
In his proposal, multicast address selection is algorithmically
derived from the multicast IP address
and yields a many-to-one mapping of multicast
IP addresses to LAN multicast addresses. As a consequence,
there is no assurance that any given multicast
address will be allocated solely for the use of
a single digital video transmission. This undermines 
the goal of using multicast addressing to direct the 
heavy flow of data to only those stations wishing to 
receive the data. Deering discusses the need for 
allocation of transient group addresses and alludes to
the concepts presented in this paper. 
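
The many-to-one property can be illustrated with a short Python sketch. It
assumes that the mapping referred to is the one standardized for Ethernet in
RFC 1112, in which the low-order 23 bits of the IP group address are placed
into the MAC prefix 01-00-5E; that identification is an assumption made for
this example, and the function name is hypothetical.

```python
import ipaddress


def ip_multicast_to_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast address (RFC 1112 style)."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(ip) & 0x7FFFFF        # only 23 of the 28 group bits survive
    mac = 0x01005E000000 | low23      # fixed prefix 01-00-5E, high bit of 4th octet is 0
    return "-".join(f"{(mac >> shift) & 0xFF:02X}" for shift in range(40, -8, -8))


# Three different IP groups collapse onto one LAN multicast address.
for group in ("224.1.1.1", "225.1.1.1", "224.129.1.1"):
    print(group, "->", ip_multicast_to_mac(group))
```

Because five bits of the group address are discarded, 32 IP groups share each
LAN address, so a station that enables one LAN address may receive traffic for
groups it never requested.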

Model for Dynamically Allocating 
Multicast Addresses 

Given the overall goals of the project and the 
desired characteristics of the application, the fol- 
lowing model was developed. It transmits digital 
video on a data network using dynamically allo- 
cated multicast addresses. First, simple operational 
cases on the LAN are described. Then complicated 
scenarios dealing with network misoperations are 
addressed. 

It should be noted that the protocols described 
address the location of video program material as 
well as the allocation of multicast addresses for 
delivery of that material. Because of the one-to-one 
correspondence between video material and 
address allocation, it is convenient to combine 
these two functions into a single protocol; how- 
ever, the focus of this paper remains on the address 
allocation aspects of the protocol. 

Multicast Address Pool

This model assumes a set of n multicast addresses 
permanently allocated and devoted to it. The 
addresses are obtained through the normal process 



for allocation of multicast addresses through the 
IEEE. All clients and servers participating in this 
protocol use the same set of addresses. For the sake 
of this discussion, these addresses are denoted as 
A1, A2, ..., An. Address A1 is always used by the participating
stations for exchange of information necessary
to control the allocation of the remaining
addresses for use by the participating stations. The 
remaining addresses A2 through An form the pool 
of available multicast addresses. 
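
As a minimal sketch of this arrangement, the block of n addresses can be split
into the control address and the allocation pool. The placeholder names "A1"
through "A8" and the function name are hypothetical; real addresses would be
48-bit values assigned by the IEEE.

```python
def split_address_block(block):
    """Return (control_address, pool) for a block of addresses A1..An."""
    if len(block) < 2:
        raise ValueError("need A1 plus at least one pool address")
    # A1 carries the allocation protocol itself; A2..An carry video programs.
    return block[0], list(block[1:])


block = [f"A{i}" for i in range(1, 9)]   # hypothetical block with n = 8
control, pool = split_address_block(block)
print("control:", control)
print("pool:   ", pool)
```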

Server Announcements 

All servers capable of transmitting digital video 
data continuously announce their presence and 
capabilities by transmitting a message at a predeter- 
mined interval; for example, a message is addressed 
to A1 every second. In these announcements, the
servers include information identifying their gen- 
eral capabilities, data streams they are currently 
transmitting, and data streams they are capable of 
transmitting. 

A server's general capabilities include its name 
and network address(es). Other useful information
can also be announced, but it is not relevant to 
this discussion. To identify the data streams cur- 
rently being transmitted, the server describes 
the data and the multicast address to which each 
data stream is being transmitted. In this way, it 
announces those multicast addresses that the sta- 
tion is currently using, along with a description of 
the associated video program. The data streams the 
server is capable of transmitting are identified by 
some form of a description of the data stream. 
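
One possible in-memory shape for such an announcement is sketched below in
Python. The paper does not specify a message format, so every field name here
is hypothetical; the sketch only groups the three kinds of information listed
above.

```python
from dataclasses import dataclass, field


@dataclass
class Announcement:
    """Contents of one server announcement sent to control address A1."""
    server_name: str
    node_addresses: list                             # the server's network address(es)
    active: dict = field(default_factory=dict)       # stream description -> multicast address
    available: list = field(default_factory=list)    # descriptions of streams on offer

    def addresses_in_use(self):
        """Multicast addresses this server currently transmits to."""
        return set(self.active.values())


msg = Announcement(
    server_name="video-server-1",
    node_addresses=["08-00-2B-00-00-01"],
    active={"training session 12": "A3"},
    available=["training session 12", "evening news"],
)
print(msg.addresses_in_use())
```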

Identifying Servers and 
Available Programs 

With each server continuously announcing the pro- 
gram material available for playback, clients wish- 
ing to receive a particular data stream can monitor 
the server announcements being sent to address 
A1. By receiving these announcements, a client can
ascertain the address of each server active on the 
LAN, the data streams currently being transmitted 
by each server and the multicast address to which 
each is being transmitted, and the data streams 
available for transmission. 

With a large repository of program material, 
it could easily become impractical to announce 
all available material. In this case, the announce- 
ments could be used only to locate available 
servers, and an inquiry protocol or database search 
mechanism could be used to locate available material
more efficiently.

Once a client identifies a server that is offering 
the desired data stream, it can request that the 
server begin transmission. The client sends a mes- 
sage identifying the desired playback program 
material. In response, the server allocates a unique 
multicast address, includes the new material and 
multicast address in its announcement messages, 
and begins transmitting the program material. 

Address Allocation and Tracking 
Each server maintains a table containing the usage 
of each of the A2 to An addresses. Each address is 
tagged as either currently used or available for use. 
When a server receives a client's request for trans- 
mission of a new data stream, the server selects a 
currently unused multicast address and includes 
the address and data stream description in its 
announcements of data streams currently being 
transmitted. After sending two announcements, 
the server begins transmitting the data to the cho- 
sen multicast address. Sending two announcements 
before beginning transmission provides client 
nodes with ample time to ascertain the address to 
which the data will be sent and to enable reception 
of the video program. 

In addition to sending announcement messages, 
the servers also listen to the announcements from 
other servers to keep track of all multicast 
addresses currently in use on the LAN. Each time a 
server receives an announcement message from 
another server, it notes the addresses being used 
and marks them all as used in its table. This pre- 
vents a server from allocating an address already 
used by another server and eliminates the need for 
a central database or clearinghouse. 
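
The table kept by each server can be sketched as follows. This is a minimal
Python illustration, assuming placeholder address names; the class and method
names are hypothetical, and the sketch does not cover how an entry is marked
available again once a stream ends.

```python
class AddressTable:
    """Per-server record of which pool addresses (A2..An) are in use."""

    def __init__(self, pool):
        self.in_use = {address: False for address in pool}

    def note_announcement(self, addresses):
        """Mark every address listed in another server's announcement as used."""
        for address in addresses:
            if address in self.in_use:
                self.in_use[address] = True

    def allocate(self):
        """Claim a currently unused address, or return None if the pool is exhausted."""
        for address, used in self.in_use.items():
            if not used:
                self.in_use[address] = True
                return address
        return None


table = AddressTable(["A2", "A3", "A4"])
table.note_announcement({"A2"})      # another server announced a stream on A2
print(table.allocate())              # the first address not known to be in use
```

Because every server builds the same picture from the announcements it hears,
no server needs to consult a central authority before choosing an address.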

If a server observes that it is using the same 
address as another server, then the server moves 
its data transmission to another address if and only 
if its node address is numerically lower than the 
other server's node address. The new address is 
allocated exactly as it would be if the server were 
beginning to transmit the data stream for the first 
time. This algorithm resolves conflicts where two 
or more servers choose the same available multi- 
cast address at the same time. In addition, it 
resolves a similar conflict that occurs when two 
separate LAN segments become joined and two 
servers suddenly find they are using the same multicast
address.



Clashing allocations of multicast addresses can be 
held to a minimum if servers allocate an address at 
random from the remaining pool of addresses rather 
than all servers allocating in the same fixed order. 
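
The tie-breaking rule and the random selection can be sketched in a few lines
of Python. Node addresses are modeled as integers so that they can be compared
numerically; the function names and the example values are hypothetical.

```python
import random


def must_move(my_node_address, other_node_address):
    """A server gives up a contested multicast address only if its own
    node address is numerically lower than the other server's."""
    return my_node_address < other_node_address


def pick_random_free(free_addresses, rng=random):
    """Choose at random among unused pool addresses to make clashes less likely."""
    candidates = sorted(free_addresses)
    return rng.choice(candidates) if candidates else None


server_a, server_b = 0x08002B000001, 0x08002B000002   # example node addresses
print(must_move(server_a, server_b))   # server A has the lower address, so it moves
print(must_move(server_b, server_a))   # server B keeps the contested address
print(pick_random_free({"A4", "A7", "A9"}))
```

Since exactly one of two clashing servers has the lower node address, exactly
one of them moves, and the other stream continues undisturbed.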

Identifying and Stopping Playback 
After a client requests playback of new material, it 
can then examine the server's announcements, and 
when the desired data stream appears as being 
transmitted by the server, the client can begin 
receiving data from the advertised multicast 
address. At this point, any other client stations on 
the LAN can also receive the same video program by 
enabling receipt of the same address. 

When no more clients wish to view a partic- 
ular program, a mechanism is needed to inform 
a server to stop transmission and return the asso- 
ciated address to the free pool. Two alternative 
approaches were considered to stop playback; one 
was chosen for several reasons. 

In the first approach, each server tracks the num- 
ber of clients that have requested a particular pro- 
gram by simply counting the number of requests 
for that program. In addition, clients are required to 
notify the server when they are finished viewing. 
The server then continues to transmit the material 
until all interested clients have indicated they are 
no longer interested in viewing. This approach has 
two problems. If a viewing client node is reset or 
disconnected, or if its message to end viewing is 
lost, the server could lose track of the number of 
viewing clients and never stop playing a particular 
program. The second problem, which is more of a 
nuisance, is that clients have to request playback of 
a program even if it is already playing to enable the 
servers to track the number of viewers. 

In the preferred approach, interested clients 
periodically remind the server that they wish to 
continue viewing the program. Servers then simply 
keep playing the material until no client expresses 
interest for some period of time. For example, 
clients could reiterate their interest in a program 
every second, and a server could continue transmit- 
ting a requested program until it did not receive a 
reminder for 3 seconds. This time lapse would 
accommodate lost reminder messages from clients, 
and client failure would result in transmission ter- 
mination within 3 seconds. In addition, when all 
clients had finished viewing the material, the 
server, multicast address, and consumed network 
bandwidth would be released within 3 seconds, 
making them available for other uses. Selection of
the actual timer value depends on the desired bal- 
ance between ongoing consumption of network 
resources (bandwidth and multicast addresses) 
after all receiving parties have stopped viewing the 
data, and network, end system, and server resource 
consumption caused by more frequent reminder 
messages. 
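
The reminder-based approach amounts to a timeout per data stream. The Python
sketch below illustrates it with the example values from the text (reminders
about every second, a 3-second timeout); the class and method names are
hypothetical, and time is passed in explicitly so that the logic can be
followed without a real clock.

```python
class StreamInterest:
    """Server-side record of when each stream was last requested."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout
        self.last_reminder = {}          # stream -> time of latest client reminder

    def remind(self, stream, now):
        """Record a reminder from any client; the server need not count clients."""
        self.last_reminder[stream] = now

    def expire(self, now):
        """Return streams with no reminder for `timeout` seconds and forget them."""
        stale = [s for s, t in self.last_reminder.items() if now - t >= self.timeout]
        for stream in stale:
            del self.last_reminder[stream]
        return stale


interest = StreamInterest()
interest.remind("evening news", now=0.0)
interest.remind("evening news", now=1.0)     # a second client, or the same one again
print(interest.expire(now=3.5))              # still within 3 s of the last reminder
print(interest.expire(now=4.0))              # 3 s of silence: stop and free the address
```

A lost reminder or a failed client simply delays or triggers the timeout, which
is why the server never needs an explicit end-of-viewing message.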

Changing Multicast Addresses 
Aside from receiving and processing the data for a 
video program, client stations must also continue 
to examine the server announcement messages and 
remain alert to possible changes in the multicast 
address to which the received program is being 
transmitted. As noted above, address allocation can 
change at any time due to merging of LAN segments 
or duplicate allocation by two servers. Anytime a 
client notes a change in address, it must stop receiv- 
ing data on the previous address and resume receiv- 
ing with the new address. A momentary disruption 
in playback is likely to occur, but such disturbances 
are infrequent because only merging LANs cause 
duplicate allocations of addresses in the middle of 
playback. 

Under the circumstances described earlier, a 
client can find itself receiving two data streams on 
the same multicast address for some finite time 
period until the servers resolve the allocation of 
that address. Clients can gain immunity to this situ- 
ation by noting the source address of the server that 
originally provided the data stream, and discarding 
all data received on the multicast address that is not 
from the source address. With this improvement, 
clients can easily distinguish the data stream of 
interest from another which might momentarily 
appear addressed to the same multicast address. 
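
A client-side filter of this kind could look like the following Python sketch.
Frames are modeled as (source address, payload) pairs received on one multicast
address; the representation and the function name are hypothetical.

```python
def frames_from_server(frames, server_address):
    """Keep only payloads whose source address is the server chosen by the client."""
    return [payload for source, payload in frames if source == server_address]


# Two servers momentarily transmit to the same multicast address.
received = [
    ("server-1", b"frame 1"),
    ("server-2", b"other program"),
    ("server-1", b"frame 2"),
]
print(frames_from_server(received, "server-1"))
```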

The allocation and resolution of multicast 
address use can be improved if servers send their 
announcements at an increased rate for some time 
period after a new data stream begins transmitting 
or when a data stream changes address. Such accel- 
erated announcements permit client stations to 
more quickly identify the address of a requested 
data stream, and more quickly identify when a data 
stream has moved from one address to another. 
They also permit servers to more quickly identify 
instances of clashing multicast addresses and 
resolve them. For example, the announcement
interval could be shortened from 1 second to
one-quarter second for a 2-second duration and
then restored to 1-second intervals.
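
The accelerated schedule in this example can be written out as a small Python
sketch. The interval values come from the text; the function names are
hypothetical, and time is measured in seconds since the new stream began or the
address changed.

```python
def announcement_interval(elapsed, normal=1.0, fast=0.25, fast_period=2.0):
    """Interval to wait before the next announcement, given time since the change."""
    return fast if elapsed < fast_period else normal


def announcement_times(horizon):
    """List the announcement times in the first `horizon` seconds after a change."""
    times, t = [], 0.0
    while t < horizon:
        times.append(t)
        t += announcement_interval(t)
    return times


print(announcement_times(4.0))
```

In this sketch eight announcements fall within the first 2 seconds, after which
the server reverts to one announcement per second.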



Extension to Interconnected LANs 
The described protocols and allocation methods 
function correctly across multiple LANs intercon- 
nected by bridges since bridges nominally forward 
multicast traffic. Many bridge implementations per- 
mit management control over the forwarding of 
multicast data. This can unintentionally interfere 
with the desired operation of this protocol, but 
it can also serve as a useful tool to confine data
traffic to particular LAN segments. Another prac- 
tical consideration in the particular application 
described here is the ability of a bridge to forward 
the large amounts of data traffic involved in digital 
video without detrimentally impacting the time- 
dependent nature of the data. 

Extending the protocols to a wide area network 
is a more difficult procedure. Routers do not for- 
ward multicast traffic, but they could if used as 
proxy nodes between LANs. Router forwarding 
performance tends to be even lower than bridge 
forwarding rates, which discourages the operation 
of this system over a router. 

Conclusions 

Dynamic allocation of multicast addresses is criti- 
cal to enable features such as the continuous play of 
full motion video over a network with multiple 
viewers. It is not feasible (or at least is very difficult)
for a server to transmit a data stream individually 
to all clients wishing to receive it. If, on the other 
hand, the desired data stream is transmitted to the 
broadcast address, all stations on the LAN have to 
receive an enormous volume of data whether they 
are interested or not. It is highly desirable not
to inundate uninterested clients with video data
streams, but to send each stream only to the clients
that want to receive it.

Multicast addresses are well suited (in fact 
designed) for transmission to some arbitrary group 
of stations. To prevent a client that is receiving one 
video stream from being inundated by other video 
streams, a unique multicast address is required 
for each unique data stream. Since the number of possible
data streams is effectively unbounded, it is
impossible to allocate a unique multicast address 
for every data stream. A mechanism to allocate 
a unique multicast address from a finite set of 
addresses for the duration of the data stream is the 
ideal choice. The described mechanism also has the 
attractive characteristic that it is completely dis- 
tributed; there is no central agent for allocation of 
multicast addresses; therefore it is more reliable as
servers join and leave the LAN. 

Although transmission of digital video data has 
prompted this system design, the basic mechanism 
for dynamically allocating multicast addresses can 
be applied to any application with similar needs. 

CASE Integration Using
ACA Services

Digital uses the object-oriented software Application Control Architecture (ACA)
Services to address the problems associated with data access, interapplication com- 
munication, and work flow in a distributed, multivendor CASE environment. The 
modeling of applications, data, and operations in ACA Services provides the foundation
on which to build a CASE environment. ACA Services enables the seamless
integration of CASE applications ranging from compilers to analysis and design 
tools. ACA Services is Digital's implementation of the Object Management Group's 
(OMG) Common Object Request Broker Architecture (CORBA) specification. 



Based on work accomplished in many computer- 
aided software engineering (CASE) projects, this 
paper describes how Digital's object-oriented
Application Control Architecture (ACA) Services 
can be used to construct a CASE environment. The 
paper begins with an overview of the types of CASE 
environments currently available. It describes the 
object-oriented technique of modeling applica- 
tions, data, and operations and then proceeds to 
discuss design and implementation problems that 
might be encountered during the integration pro- 
cess. The paper concludes with a discussion of 
environment management. 

CASE Environment Description 

Today's CASE environments are required to operate 
in network environments that consist of geographi- 
cally distributed hardware manufactured by multi- 
ple vendors. In such environments, access to data, 
metadata, and the functions that operate on this 
data must be as seamless as possible. This can be 
accomplished only when well-architected proto- 
cols exist for the exchange of information and con- 
trol. These protocols need not be defined at the 
level of network packets, but rather as operations 
that have well-defined, platform-independent inter- 
faces to predictable behaviors. 

In addition to utilizing the various applications, 
environments deal with how applications are orga- 
nized or grouped within a project and how work 
flows between applications and within the environ- 
ment as a whole. These concepts are discussed 
later in the paper as are the different styles of inte- 
gration that an application can employ. 



Data integration, i.e., information sharing, is vital
to any CASE environment because it reduces the 
amount of information users must enter. However, 
data integration must be accompanied by a mecha- 
nism that allows control to pass from one applica- 
tion to another. This mechanism, commonly called 
control integration, provides a means by which 
the appropriate application can be started and 
requested to perform an operation on a piece of 
information. Control integration is also used to 
exchange information between cooperating appli- 
cations, regardless of their geographic locations. 
These two integration mechanisms used in tandem 
can solve many of the problems presented by a dis- 
tributed, multivendor CASE environment. 

ACA Services is Digital's implementation of the
Object Management Group's (OMG) Common Object
Request Broker Architecture (CORBA) specification.
ACA Services is designed to solve problems asso- 
ciated with application interaction and remote data 
access in distributed, multivendor environments 
such as the CASE environments just described. This 
support includes the remote invocation of applica- 
tions and components without the need for multi- 
ple logins or the use of terminal emulators. The 
encapsulation features of ACA Services allow the 
use of applications not designed for distributed 
environments. ACA Services can also be configured, 
in a way transparent to the application, for use on a 
local host. 

The central focus of a CASE environment is on 
how easily functions such as compiling, building, 
and diagramming can be performed. The functions 
available form the foundation on which the
environment is constructed. Therefore, the first
step in the design of a CASE environment is to deter- 
mine what functions to offer. The applications cur- 
rently available to support these functions may be 
integrated using one of two paradigms: application- 
oriented or data-oriented. 

Application-oriented Paradigm 
CASE environments that follow the application- 
oriented paradigm focus on standalone applica- 
tions used to develop software such as editors, 
compilers, and version managers. Application- 
oriented environments normally comprise a col- 
lection of applications that support the necessary 
functions. In application-oriented environments, 
integration tends to be focused on direct communi- 
cation between two different applications. In this 
paradigm, the requesting application knows which 
class of application can be used to satisfy a par- 
ticular request. Environments that present an 
application-oriented paradigm to the user require 
the user to have knowledge of the applications that 
can be used to perform specific tasks. 

As the level of task complexity increases, it 
becomes increasingly important to build environ- 
ments that utilize a paradigm focused on the data 
associated with the task being done and not on the 
applications used to perform the task. Recognition
of this need has led to the emergence of
data-centered environments.

Data-oriented Paradigm 
CASE environments that use a data-oriented para- 
digm are centered around the data associated with 
the task the user is performing. To accomplish 
a task in such environments, operations are per- 
formed on a data object. Using the object being 
addressed, the operation, and preferences supplied 
by the user, the environment determines which 
application will be used to perform the requested 
operation. Thus, the requesting application requires 
no knowledge about which application implements 
an operation. This paradigm is extremely useful in 
CASE environments because of the diversity of 
objects and range of applications available to per- 
form certain operations. 
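
As a rough illustration of the data-oriented paradigm, the following Python sketch selects an application from the class of the object, the requested operation, and the user's preferences. The table contents, the application names (EDITOR_A, EDITOR_B, and so on), and the function select_application are hypothetical placeholders chosen for this example; they are not part of ACA Services.

```python
# Hypothetical sketch: pick an application from (data class, operation, preferences).
# All names below are illustrative placeholders, not ACA Services definitions.

# Applications able to perform each operation on each data class.
CAPABILITIES = {
    ("SOURCE_FILE", "Edit"): ["EDITOR_A", "EDITOR_B"],
    ("SOURCE_FILE", "Compile"): ["COMPILER_A"],
    ("OBJECT_FILE", "Link"): ["LINKER_A"],
}


def select_application(data_class, operation, preferences=None):
    """Return the application that should perform `operation` on `data_class`."""
    candidates = CAPABILITIES.get((data_class, operation))
    if not candidates:
        raise LookupError(f"no application implements {operation} on {data_class}")
    preferred = (preferences or {}).get(operation)
    # Honor the user's preference only if that application can do the job.
    if preferred in candidates:
        return preferred
    # Otherwise fall back to the first registered candidate (an assumed policy).
    return candidates[0]
```

The requester names only the object's class and the operation; which application runs is decided by the lookup, so a new editor can be added to the table without changing any requester.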

The application and the data paradigms can 
coexist in a single CASE environment, and in fact, 
tightly integrated CASE environments exploit the 
strengths of each paradigm. A text editor can be 
used to illustrate this point. Typically, when the 
contents of a source file need to be modified, an 



edit operation is sent to the object representing the 
file. However, a debugger may also use the same 
editor to display source code. The operation to 
position the cursor on a particular line is sent 
directly to the text editor application, rather than 
to a data object such as the line. An environment 
with such a split focus avoids the expense and com- 
plexity of presenting a complete object-oriented 
interface to the environment and results in the 
existence of both application- and data-oriented 
paradigms. 

Regardless of which paradigms and applications 
a CASE environment uses, the primary focus of the 
environment is on the objects and on the opera- 
tions that are defined on those objects. Therefore, 
after determining what functions to offer, the sec- 
ond step in designing a CASE environment is to 
understand how applications, data, and operations 
are modeled using an object-oriented approach, in 
particular the one provided by ACA Services. 

CASE Integration in 
Object-oriented Terms 

Describing environments using object-oriented 
techniques can simplify the design of an environ- 
ment. Techniques such as abstraction and poly- 
morphism can be used to describe the objects 
that comprise the environment, the operations that 
can be performed on those objects, and any rela- 
tionships that exist between objects. Further- 
more, using these techniques makes it possible to 
describe an environment as a set of classes and ser- 
vices for each class. ACA Services performs the role 
of the method dispatcher, matching an object and 
an operation with the function in an application 
that can implement that operation. Realizing the
benefits of this approach requires constructing
models for the applications, data, and operations 
that will be present in the environment. 

Modeling Applications and 
Application Relationships 
Applications that are integrated into an environ- 
ment can provide various functions or services to 
other members of the environment. The number of 
services an application provides depends not only 
on the capabilities of the application but also on 
the way it is modeled. These services are stand- 
alone pieces that can be plugged into a system to 
perform specific functions. An application can 
define a single operation whose sole function is to 
start the application; an application can export the
entry points of its callable interface; or an application
can define sets of operations for each type of
object it manipulates. In support of application 
modeling, ACA Services provides the concepts of 
application classes, methods, and method servers. 
Figure 1 illustrates the relationships among the var- 
ious pieces of information used to model an appli- 
cation in ACA Services. 1 

In ACA Services, the definition of an application is 
divided into two pieces: interface and implementa- 
tion. The interface definition is concerned with the 
publicly visible aspects of the application. These 
include class definitions for the objects that the 
application manipulates, a class definition for the 
application itself, and definitions of operations that 
the application supports. The operations, which 
represent the functions provided by the applica- 
tion, are modeled as messages on the application 
class definition. These messages define a consistent 
interface to various implementations of the opera- 
tions. Placement of the application class definition 
affects the behaviors this definition inherits. This is 
sometimes called classification. The classification
of each component of an application depends on
whether a component contains a superset or a subset
of the functions contained in the components
of other applications in the environment.

Figure 1 ACA Services Metadata Model

Once the application's components have been 
classified, the integrator must determine how the 
application will make its capabilities available to 
the environment: as an operating system script, as a 
callable interface, or as an executable image. The 
implementation definition represents the actual 
implementation of the application. An application 
may comprise a number of executable files and 
shared libraries. Typically, only the executable file 
used to start the application is modeled as a method 
server. If the functions of the application are pro- 
vided through a shared library or image, only the 
shared library is modeled as a method server. 

The implementations of the functions or services
exported to the environment are modeled as methods.
Methods describe the callable interfaces or
operating system scripts that implement a particu- 
lar operation and are associated with only one 
method server. 2 During the method selection pro- 
cess, the messages defined for the application and 
the objects it manipulates are mapped onto one or 
more methods. 
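
The relationships just described (messages on a class, methods that implement them, and the single method server each method belongs to) can be sketched in Python as follows. This is a minimal model under stated assumptions, not the ACA Services metadata format: the type names MethodServer, Method, and ApplicationClass, their fields, and the "first registered method wins" selection rule are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MethodServer:
    """Implementation side: the executable, shared library, or script."""
    name: str
    kind: str  # assumed values: "executable", "shared_library", "script"


@dataclass(frozen=True)
class Method:
    """A routine or script implementing one operation."""
    name: str
    server: MethodServer  # each method is associated with exactly one server


@dataclass
class ApplicationClass:
    """Interface side: messages and the methods each one maps onto."""
    name: str
    method_map: dict = field(default_factory=dict)  # message -> list of Method

    def add_method(self, message, method):
        self.method_map.setdefault(message, []).append(method)

    def select_method(self, message):
        """Map a message onto a method (assumed policy: first registered)."""
        methods = self.method_map.get(message)
        if not methods:
            raise LookupError(f"{self.name} has no method for message {message}")
        return methods[0]
```

Keeping the message table on the class and the server reference on the method mirrors the split between interface definition and implementation definition.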

Modeling Data and Data Relationships 
Data modeling is another significant aspect of creat- 
ing CASE environments, especially environments 
that utilize a data-oriented paradigm. Identifying 
the data objects that the application uses is a key 
element in the process of integrating that applica- 
tion. The list of data objects should include those 
objects for which the application provides a ser- 
vice, as well as those objects on which the applica- 
tion makes requests. The variety and quantity of 
data objects can vary from application to applica- 
tion and depend on an application's capabilities 
and the paradigm utilized. To support the modeling 
of data objects, ACA Services uses the concept of 
data classes. Note that, rather than provide instance 
management for data objects, ACA Services pro- 
vides a means to represent the data classes used by 
an application as metadata. 

Because environments that utilize a data- 
oriented paradigm may contain many data classes, 
ACA Services organizes the data classes into an inheri- 
tance hierarchy. This hierarchy allows responsi- 
bilities, such as operations and attributes, to be 
inherited by other data classes. Data classes found 
in an ACA Services inheritance hierarchy are related
to one another through an "is-kind-of" relationship.
A class that has an "is-kind-of" relationship with 
one or more superclasses must support all opera- 
tions defined on the superclasses from which it 
inherits. 3 A subclass is not limited to those operations
and attributes defined by a superclass but may
have other operations, as well as refinements to 
inherited operations and attributes. 
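
A small Python sketch may help make the "is-kind-of" rule concrete: a class supports its own operations plus every operation defined on its superclasses. The DataClass type and the three-level chain used in the example (FILE, TEXT_FILE, SOURCE_FILE, with Copy, Edit, and Compile) are assumptions made for illustration, not the hierarchy defined by ACA Services.

```python
class DataClass:
    """Hypothetical data class with multiple-superclass inheritance."""

    def __init__(self, name, superclasses=(), operations=()):
        self.name = name
        self.superclasses = tuple(superclasses)
        self.own_operations = set(operations)

    def operations(self):
        """Own operations plus everything inherited from superclasses."""
        ops = set(self.own_operations)
        for parent in self.superclasses:
            ops |= parent.operations()
        return ops

    def is_kind_of(self, other):
        """True if `other` is this class or one of its ancestors."""
        return self is other or any(p.is_kind_of(other) for p in self.superclasses)


# Illustrative chain; the operation names are assumed for the example.
FILE = DataClass("FILE", operations=["Copy"])
TEXT_FILE = DataClass("TEXT_FILE", [FILE], ["Edit"])
SOURCE_FILE = DataClass("SOURCE_FILE", [TEXT_FILE], ["Compile"])
```

Here SOURCE_FILE supports Copy and Edit by inheritance and adds Compile as its own operation.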

Modeling Operations 

As mentioned previously, operations are modeled 
as messages in the CASE environment. The name of 
the message describes the type of operation. Some 
messages are data oriented, e.g., Edit, Reserve, and
Copy, whereas other messages are application oriented,
e.g., ExecuteCommand and TerminateServer.
Messages provide a consistent abstraction of the 
functions provided by applications. This abstrac- 
tion allows the details of how a function is 
implemented to be hidden from the requesting 
application. Since ACA Services supports more than 
one implementation for a single message, it also 
provides a means to hide various implementations. 

The developer should anticipate different imple- 
mentations of a message within the environment 
and be aware that a message may apply to a variety 
of classes. The developer must consider how the 
operation on an object might be used by various 
applications and in future environments. 4 In this
way, adding new types of objects to an environment 
requires only minor changes, if any, to applications 
that are already integrated. 

Operation Interactions The semantics of a mes- 
sage dictates which particular interaction model is 
to be used. ACA Services can be used to construct 
a number of different interaction models: syn- 
chronous request, asynchronous request, and 
request/reply, as shown in Figure 2. The syn- 
chronous request interaction model, shown in 
Figure 2a, is useful when serial operations originate 
from a single source. This model blocks the execu- 
tion of the client application during a request. 
Control is returned to the client application only 
after the server application receives and executes 
the request and outputs data, if any. 

The asynchronous request interaction model, 
shown in Figure 2b, is useful in situations where 
the client can process other work until the server 
application completes the request. This model is 
especially beneficial when the requested operation 
takes a considerable amount of time to complete or 
if the server is busy with other requests. Execution 



of the client application is blocked only for the 
amount of time required to deliver the request. 
Client execution resumes once the request has 
been delivered. Upon completing the processing of 
the request, the server application notifies the 
client application of the completion and returns 
any output data. 

The request/reply interaction model, shown in 
Figure 2c, is most appropriate for requests whose 
implementations cannot perform the operations 
required to obtain the necessary output data. 
Gateway and message-forwarding applications are 
examples of applications for which this type of 
interaction model is best suited. In this model, the 
message that represents the request cannot have 
any output arguments and must pass an application 
handle to itself. The server application uses the 
application handle to return any output informa- 
tion to the requester by sending a message that rep- 
resents the reply. In a request/reply model, a single 
reply message should be defined for returning infor- 
mation, thus reducing the number of messages an 
application must support. 
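
The three interaction models can be contrasted with a short Python sketch. It uses a thread pool and a queue from the standard library as stand-ins for the server and for the requester's application handle; it does not use ACA Services, and the names server_compile, CompileReply, and the three request functions are hypothetical.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def server_compile(source):
    """Stand-in for an operation performed by a server application."""
    return f"{source}.obj"


def synchronous_request(source):
    # (a) The client blocks until the server has executed the request.
    return server_compile(source)


def asynchronous_request(executor, source):
    # (b) The client blocks only while the request is handed over; it gets a
    # future back and is notified of completion through that future.
    return executor.submit(server_compile, source)


def request_reply(executor, source, reply_to):
    # (c) The request has no output arguments. It carries a handle to the
    # requester (here a queue), and output returns as a separate reply message.
    def handle():
        reply_to.put(("CompileReply", server_compile(source)))

    executor.submit(handle)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(synchronous_request("a.c"))
        pending = asynchronous_request(pool, "b.c")
        print(pending.result())
        inbox = queue.Queue()
        request_reply(pool, "c.c", inbox)
        print(inbox.get())
```

Using one reply message name for all returned information, as the sketch does with CompileReply, follows the recommendation to keep the number of messages an application must support small.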

Message Arguments A message argument for 
passing the object being manipulated need not be 
defined. ACA Services automatically passes the 
object to which the message was sent to the 
method. Each method routine can access the object 
through a structure containing context informa- 
tion for the current invocation. 

The arguments of a message should not be 
designed around a specific instance of an applica- 
tion, nor should they imply how an object is physi- 
cally stored. To help meet these design criteria, all 
references to an object should be passed as instance 
handles. In this way, the application that receives 
the instance reference can use it directly for sub- 
sequent operations on that object. In addition, 
when defining the message arguments, developers 
should consider other applications that could be 
instances of a particular class and possibly used as 
replacements. 

However, not all instances of an application
have the same set of capabilities. To support the various
capabilities, the developer may have to define
additional arguments to represent bit masks and 
flags. An argument list or an item list can be used 
to pass information about different data types or 
quantities. The message design should not require 
implementation-specific information for proper 
application operation; this design implies that reasonable
defaults accommodate any unspecified
information. In cases where proper operation of an
application requires implementation-specific infor- 
mation, the most suitable design is to use the con- 
text object as a place to store the default values. 
With such a design, the application no longer needs 
to use hard-coded default values and can be cus- 
tomized for the environment. 
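
The idea of keeping defaults in the context object rather than in application code can be sketched as below. The function resolve_arguments and the attribute names window_width and read_only are hypothetical; a plain dictionary stands in for the context object.

```python
def resolve_arguments(supplied, context_defaults):
    """Overlay caller-supplied arguments on defaults held in a context object.

    Arguments passed as None are treated as unspecified, so the default from
    the context applies. Neither input dictionary is modified.
    """
    resolved = dict(context_defaults)
    resolved.update({k: v for k, v in supplied.items() if v is not None})
    return resolved


# Example context: defaults can be customized per environment without
# changing the application.
context = {"window_width": 80, "read_only": False}
arguments = resolve_arguments({"read_only": True}, context)
```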

Integration Frameworks 

A number of issues must be resolved in the con- 
struction of a CASE environment before the first line 
of code can be written. Many of these issues center 



around the modeling of objects in the environment. 
As discussed in the previous section, abstraction is 
used to hide much of the actual implementation of 
the operations on objects from the requesting 
application. However, additional context may be 
required for further operations. If the application is 
using an application-oriented paradigm, most oper- 
ations are directed to an application class that pro- 
vides the service. In cases where a data-oriented 
paradigm is used, the application typically directs 
operations to the data class of which the object is 
an instance. 



Vol. 5 No. 2 Spring 1993 Digital Technical Journal



Besides the application and data objects found in 
the environment, the designer must also take into 
consideration the other components of the CASE 
environment itself. Figure 3 shows the major com- 
ponents of a CASE environment: activities, applica- 
tions, application and data interfaces, work flow 
management, and handle management. Each com- 
ponent represents a particular aspect of the overall 
environment. The components are introduced in 
this section and described in detail elsewhere in 
the paper, as indicated. 

Activities provide the basic work structure for a 
particular task within an environment. Each activ- 
ity comprises one or more applications and a num- 
ber of data objects, forming a single composite 
object. Applications within an activity operate 
through the application interfaces. The section 
Application Integration describes the principles of 
an activity and includes a discussion of the sharing 
of applications within and among other activities. 

Application interfaces, illustrated in Figure 3 as 
arrows connecting the various applications, form 
the primitives by which integration is accom- 
plished. Some of the more general concepts for 
application interfaces were discussed in the sec- 
tion Modeling Operations; these concepts are 
described in detail in the section Styles of 
Application Interfacing. 

Finally, the section Environment Management 
addresses how to manage the flow of work within 
the environment. This section describes the 
management of instance and application handles, 
the use of storage classes as a means to provide 
data transformations, and the management of 
events within the environment. Understanding
each of these topics requires the following
basic information about various aspects of the
environment.

Adding New Implementations 
Updates to the environment may include adding 
new application classes, data classes that the new 
application supports, method definitions for the 
application, and possibly a method server defini- 
tion. As described earlier in the paper, ACA Services 
uses data and application classes to represent the 
different classifications of data and application 
objects found in an environment. Storage classes 
represent the classifications of storage and how 
objects are referenced in the environment. Each 
class, i.e., data, application, and storage, contains a 
list of messages that represent the operations that 
can be performed on the class. 

Digital's CASE environment, COHESION, was 
designed to present a data-oriented perspective to 
the user. An initial level of integration was achieved 
by utilizing this same data-oriented approach to 
application integration. Implementation of a data- 
oriented approach required that method maps for 
messages on data classes contain an indirect reference
to an abstract application class. 5 Figure 4 illustrates
this concept by showing two different
messages: the Edit message, which uses an indirect 
method reference, and the Browse message, which 
uses a direct method reference. An indirect method 
reference has two parts separated by the character 
'@': first, the name of the message to be sent; and 
second, the name of the class on which to send the 
message. Although not commonly done, an indirect 
method reference allows the original message to be 
mapped to another message on a different class, 
given that both messages have arguments of the
same type, direction, and order. Both messages
must also return the same type of object. 

On encountering an indirect method reference, 
ACA Services first looks at tables in the context 
object for an attribute that matches the reference. If 
such an attribute is found, ACA Services uses the 
attribute value to determine the class and message 
that should be checked next. Thus, users can pro- 
vide a mapping to their preferred application for the 
operation. If no matching attribute is found, ACA 
Services uses the message and class specified in the 
indirect method reference as the next place to 
check. 
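
The lookup order for an indirect method reference can be sketched in Python as follows. This is an illustration under assumptions, not the ACA Services algorithm: the context object is reduced to a dictionary keyed by the reference string, the attribute value is assumed to use the same Message@Class form, and the class names EDITOR and MY_EDITOR are hypothetical.

```python
def parse_indirect_reference(reference):
    """Split 'Message@Class' into (message, class_name)."""
    message, sep, class_name = reference.partition("@")
    if not sep or not message or not class_name:
        raise ValueError(f"not an indirect method reference: {reference!r}")
    return message, class_name


def resolve_indirect_reference(reference, context):
    """Return the (message, class_name) that should be checked next.

    If the context holds an attribute matching the reference, its value
    (a user preference) takes precedence; otherwise the message and class
    named in the reference itself are used.
    """
    target = context.get(reference, reference)
    return parse_indirect_reference(target)


# A user who prefers a particular editor overrides the abstract class.
user_context = {"Edit@EDITOR": "Edit@MY_EDITOR"}
next_step = resolve_indirect_reference("Edit@EDITOR", user_context)
```

With an empty context the reference resolves to the abstract class it names, which is why a default entry in some context object is needed if the abstract class has no implementation of its own.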

The approach used in COHESION has many advan- 
tages over specifying either a direct reference to a 
method or an indirect reference to a specific appli- 
cation class. This approach does not limit the user's 
ability to specify application preferences associ- 
ated with using direct references to methods, nor 
does it burden the installation of the application 
with determining all the data classes that will need 
to be updated (as required with indirect references 
to a specific application class). In addition, the 
approach allows the application developer to do the 
least amount of work and still provide the maximum 
level of support for user preferences in applications. 

Using ACA Services, the application developer 
must create an application class definition for each 



CASE application to be added. Consequently, the 
class hierarchy contains both abstract and instance 
classes. The application class is required to contain 
all the messages defined on its superclass, plus any 
additional messages that the application supports. 
The method map of each message on an application 
class should contain a direct reference to the 
method that implements the operation. Although 
better than the other alternatives, the COHESION 
approach has no default implementation unless one 
is explicitly specified in a context object. To over- 
come this problem, an entry for each message 
defined on the abstract application class must be 
created in one of the context objects. The values 
for these entries point to the corresponding mes- 
sage on the class of application used as the default 
implementation. 

Common Classes 

Common classes for a CASE environment provide 
CASE application developers with a description 
about how an application fits into the environ- 
ment, the behaviors the application must support, 
and the messages that result in those behaviors. 
The notion of plug-and-play in the environment 
is achieved through the use of common classes. 
An implementation that adheres to the description
of a particular class of applications can be
easily switched with another implementation that
adheres to the same application class semantics. 

Programs like COHESION are working toward 
a set of common classes for CASE environments. 
The set currently defined contains classes for many 
types of data and applications found in CASE envi- 
ronments focused on the coding and testing phases 
of the software development process. A graphical 
view of the data portion of the hierarchy is shown 
in Figure 5. The hierarchy is partially based on the 
hierarchy found in ATIS, a standard for tool integra- 
tion, and utilizes the strength of the ATIS data 
model. 6 (Shaded boxes indicate the classes that are 
specific to ATIS.) Encompassing the ATIS model, the 
hierarchy presents a uniform data model for the 



integration of data throughout the CASE environ- 
ment. The set of classes, although not exhaustive, 
serves as a basis on which a CASE environment can 
be built. Extensions of the hierarchy will occur as 
new classes of applications and their associated 
data objects are integrated into the environment by 
independent software vendors, customers, and 
other CASE vendors. 

Most data classes are subclasses of the data class 
SOURCE_FILE, because the initial data class implementation
was targeted at a CASE environment
consisting of editors, compilers, builders, and ana- 
lyzers. Additional data classes for both file and 
nonfile objects will be added when applications 
that provide and manipulate these objects are 



DATA OBJECT 



ELEMENT 



NAMED 
ELEMENT 



VERSIONABLE 



FILE 



T 



CONTAINER 



LIBRARY 



CODE 

MANAGEMENT 



DIRECTORY 






l 


FILE 

DIRECTORY 



X 



RELATION 



EVENT 



VERSION 



VERSION 
RELATION 



CONTEXT 



PARTITION 



PERSISTENT 
PROCESS 



ACTIVITY 



AGGREGATE 



COMPOSITE 



X 



COLLECTION 



T 



BINARY FILE 



r 



OBJECT FILE 



X 



X 



TEXT FILE 



DIAGNOSTIC 
FILE 



EXECUTABLE 
FILE 



SOURCE FILE 



LISTING FILE 
I 



SCRIPT FILE 



LOG FILE 



COL 



CRL 



X 



HELP 



MACRO 



MESSAGE 



TPU 



UIL 



Note: Shaded boxes indicate ATlS-specific classes. 



Figure 5 Hierarchy of CASE Com mon Data Classes 



Digital Technical Journal Vol, 5 No. 2 Spring I99A 



91 



Application Control 



integrated into the environment. A number of data 
classes represent composite objects such as tests 
and activities. These data classes are used to hide 
how the object is physically stored in the environ- 
ment. Classes that represent composite objects 
have attributes with values that are actually other 
objects. For example, the test data class typically 
has attributes that represent the result of' a test run, 
an operating system script or program used to per- 
form the test, and a benchmark against which a test 
run is compared. Each of these attributes may have 
as a value a reference to the file object that contains 
the actual data. 

The portion of the hierarchy that is used to spec- 
ify application classes contains only abstract appli- 
cation classes, as shown in Figure 6. These classes 
provide structure, but more important, they define 
the operations that are inherited by any application 
that is an instance of a class. Abstract classes are 
provided for a number of the applications found in 
CASE environments that deal with the coding and 
testing functions. The hierarchy does not contain 
any classes that represent particular instances of an 
application. Such application classes exist only 
when applications are installed in the environment. 

Consistent Integration Interface 
Many CASE vendors are building products for a 
number of different environments, including elec- 
tronic publishing, office automation, computer- 
aided design, and computer-aided manufacturing, 
in addition to CASE. Therefore, vendors must decide 
how to integrate these applications into the various 



environments. Until now, most integration was
accomplished by linking one application with 
another, which resulted in tightly coupled applica- 
tions. However, such applications tend to be unable 
to operate independently, without the other mem- 
ber. Also, each coupled member tends to have 
its own application programming interface (API).
Integration performed in this manner results in an 
application that must maintain code to support 
multiple APIs, if the application is to work in a number
of environments. Such support can increase the
maintenance cost and the time and effort required 
to integrate with other implementations of applica- 
tions and environments. Other by-products of this 
approach are an increased image size and a need to 
rerelease software when a dependent application 
changes. The degree to which rerelease occurs 
varies with the platform and operating system. 

ACA Services can be used to minimize the num- 
ber of interfaces that an application must maintain 
without removing functionality; a common API
provides the interface to all potential functionality. 
The ACA Services API, along with a set of com- 
mon classes, allows the same level of interaction 
between applications that can be accomplished 
through a private API, without the negative side 
effects previously described. Through the use of 
common classes, an application can integrate with 
multiple implementations of another application 
without requiring a separate effort for each. On 
platforms where dynamic loading of libraries or 
shareable images is supported, applications can
use ACA Services to locate the appropriate library,
find the proper entry point, and transfer control to
the appropriate routine. ACA Services also provides 
a transparent mechanism for encapsulating applica- 
tions that have no callable interfaces. Use of this 
mechanism extends the number of applications 
that can be integrated and removes the need to 
develop operating system-specific code to start 
applications. 

Styles of Application Interfacing 

Creating an interface to an application that is to be 
integrated is different from integrating an applica- 
tion into an environment. Application interfacing 
deals with the public interface or interfaces that 
the application provides to another application. In 
turn, these interfaces provide the primitives that 
can be used in the integration of applications. 

Application interfaces can be created in various 
ways, with differing levels of effort. Software devel- 
opers can design new applications to utilize all the 
capabilities of ACA Services. Existing applications 
can also take advantage of the full capability of 
ACA Services, if the source code to the application 
is available and if the application can be easily 
adapted to use an event-driven model. However, 
even if the source code to an application is not 
available, applications can still be integrated into 
the environment using ACA Services. If the application
has a callable interface, a server can be written
that receives messages and calls the appropriate API 
routines. If the application does not have a callable 
interface, the application can be integrated by 
encapsulation through the use of an operating sys- 
tem script. The remainder of this section describes 
how to use each of these techniques to create an 
interface through which the application can be 
integrated into a CASE environment. 

Application Modifications 
An existing application can easily be adapted to use 
ACA Services, if the source code to the application 
is available. With minimal changes, an application 
that utilizes an event-driven design, like that used 
by most window-based applications, can operate as 
an application server. The actual modifications 
required to provide ACA Services support differ 
across applications, but for most window-based 
applications the changes are similar. As an illustration
of this style of integration, consider an editor.
Most editors are implemented as event-driven
applications, which allows easy integration
because the structure of the code requires no major
changes. To register the currently executing instance
of the application with ACA Services, a call to the 
ACAS_RegisterServer routine must be added to the 
application's initialization routine. During the pro- 
cess of run-time registration, ACA Services registers 
various information about the application, includ- 
ing the identifier of the process in which the appli- 
cation is executing, the owner of the process, and 
the class- and instance-unique identifiers for the 
application. As part of the registration, an applica- 
tion can specify an abstract name by which it can 
be located and the routines to be called when an 
ACA Services event arrives, e.g., when the server 
is instructed to shut down or when a session ends. 

Once registered with ACA Services, the application
must enter its event dispatching loop. Because
many applications have existing event dispatching
mechanisms, ACA Services has been designed for
easy integration with most mechanisms. ACA
Services provides this support by allowing the
application to define a routine called the event
notifier, which is called at signal level each time an
ACA Services event occurs. The event notifier routine
places an event on the application's work
queue for the ACA Services event. Upon encountering
the event, the application's event dispatcher
routine calls the ACAS_Dispatch routine to allow
ACA Services to dispatch the appropriate method or
management routine for the event. A description of
how ACA Services dispatches operation requests
follows.
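
The notifier-and-queue arrangement described above can be sketched as follows. This is a minimal illustration in Python, not the ACA Services API: the names work_queue, ACA_EVENT, event_notifier, aca_dispatch, and run_event_loop are all hypothetical, and aca_dispatch merely stands in for the point at which a real application would call ACAS_Dispatch.

```python
import queue

# Hypothetical stand-in for the application's existing work queue.
work_queue = queue.Queue()

ACA_EVENT = "aca-event"  # illustrative tag, not an ACA Services constant


def event_notifier():
    """Called when an ACA Services event occurs (at signal level in the
    real system). It does no work itself; it only queues a marker."""
    work_queue.put((ACA_EVENT, None))


def aca_dispatch(log):
    """Placeholder for the call to ACAS_Dispatch; it only records the call."""
    log.append("dispatched ACA event")


def run_event_loop(log):
    """Drain the work queue, handling ACA events alongside ordinary ones."""
    while not work_queue.empty():
        kind, payload = work_queue.get()
        if kind == ACA_EVENT:
            aca_dispatch(log)
        else:
            log.append(f"handled {kind}: {payload}")
    return log
```

The point of the design is that the notifier stays trivial, so the application's own dispatcher remains the single place where work is performed.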

Application Servers 

When the application to be integrated does not 
have a user interface but provides a callable interface,
integration is best accomplished by creating
an application server. Considered a form of encap- 
sulation, an application server provides a consis- 
tent programming interface to the application. An 
application server provides jacket routines that use 
the application's callable interface, hiding the 
actual details of this interface. This technique is also 
used to create applications that have a clean separa- 
tion of presentation and functions. 

Applications that implement persistent data 
stores, such as databases, code managers, and 
repositories, are prime candidates for this style of 
integration. By using an application server to 
access persistent data stores, a requesting appli- 
cation need not know how the data store is 
implemented and which implementation is to be
used. This technique promotes the reuse of existing 
functions contained in the environment regardless 
of the actual implementation of the function. 
Digital's Code Management System (DEC/CMS) and 
CDD/Repository software are examples of applica- 
tions that have been integrated using the appli- 
cation server technique. Figure 7 illustrates the 
typical structure of the various components 
involved in this style of integration.

As shown in Figure 7, the integration process
involves the following steps. (1) An invoke from the
client application of the message "Reserve" on the
object "foo.c" goes through the resolution code and
(2) out the transport to the server application. This
may result in starting the server application, if no
server was available to service the request. (3) When
the server is started, the server application's main
routine calls the event dispatcher and waits for work
to arrive. (4) When the "Reserve" message
arrives on the transport, the transport notifies the
server application, (5) causing the event dispatcher
to dispatch the "Reserve" message by calling the
method dispatcher routine. (6) The method dispatcher
routine calls the appropriate method interface
routine. (7) The method interface routine does
any work required to call the appropriate callable
interface routine. (8) When the callable interface
routine returns control to the method interface
routine, the routine can perform any work necessary
before (9) returning control to the method
dispatcher routine. (10) The method dispatcher
routine then puts any arguments to be returned in
the proper format and sends this information to the
transport, which actually sends the information
back to the client application.
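
Steps (6) through (10) amount to a table lookup followed by a jacket call. The sketch below illustrates that shape in Python under stated assumptions: cms_reserve is a hypothetical stand-in for a callable interface routine, and the dispatcher is written by hand here, whereas the paper states that ACA Services generates the real one.

```python
def cms_reserve(element):
    """Hypothetical callable-interface routine of the underlying application."""
    return f"reserved {element}"


def reserve_method(obj_name):
    """Method interface (jacket) routine: prepares the call, invokes the
    callable interface, and adapts the result (steps 7 to 9)."""
    detail = cms_reserve(obj_name)
    return {"status": "ok", "detail": detail}


# Lookup table used by the method dispatcher (step 6).
METHODS = {"Reserve": reserve_method}


def method_dispatcher(message, obj_name):
    """Find and call the method routine, then return the formatted
    reply that would go back over the transport (step 10)."""
    method = METHODS.get(message)
    if method is None:
        return {"status": "error", "detail": f"no method for {message}"}
    return method(obj_name)
```

A request for "Reserve" on "foo.c" is routed to reserve_method; a message with no registered method yields an error reply instead of an exception.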

Using the DEC/CMS application server as an exam- 
ple, the software developer must create a main rou- 
tine to (1) perform any setup required to use the 
callable interface and (2) register the existence of 
the server with ACA Services. Registration includes 
specifying the method dispatcher routine, which is 
generated by ACA Services, so that the appropriate 
method routine will be dispatched for the message 
received. 

A method routine exists for each operation that 
the server is capable of performing. The set of 
method routines is analogous to the operating system
script for compilation used to explain application
encapsulation later in this section. Because the
DEC/CMS application server is not an operating sys- 
tem script, message arguments are passed into the 
method routine directly. As mentioned earlier in 
the section CASE Integration in Object-oriented 
Terms, the object on which the current operation is 
to be performed is available to the method routine 
through the use of the invocation context struc- 
ture. Information about the object, such as its class, 
name, and generation, can be obtained by calling 
the ACAS_ParseInstanceHandle routine. The class
of the object can then be used to determine if the 
object is an element under version control, a collec- 
tion, or a group. 

Figure 7 Block Diagram of a Code Management System Application Server

The name of the object and its generation are
contained in the reference data field of the instance
handle that represents the object. Because each
different code management system has its own
representation of generation, it was necessary to
create a canonical format to represent all
implementations. Therefore, the method must convert
the canonical generation representation to a format
that is native to the implementation, i.e., DEC/CMS
specific. In addition, any method that returns a
reference to a versioned object must convert the
native generation representation to its canonical
format. Table 1 shows how an object reference can
be mapped between its canonical and DEC/CMS-specific
formats.

Once the necessary information about the object 
has been retrieved and converted to a format native 
to the implementation, the method can call the
appropriate callable interface routine, possibly 
based on the object's data class. Once the call com- 
pletes, the method must convert any objects to be 
returned into a canonical format, at which point 
the method can return the status of the operation 
and output arguments. 
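
The convert, call, convert-back pattern of a method routine can be sketched as below. The formats are assumptions made purely for illustration: the canonical form is taken to be a (name, generation) pair and the native form a string with a "/GENERATION=" suffix. Neither is claimed to match the actual mapping of Table 1, and fetch_method and callable_interface are hypothetical names.

```python
def to_native(name, generation):
    """Canonical (name, generation) pair -> assumed native string form."""
    return f"{name}/GENERATION={generation}"


def to_canonical(native):
    """Assumed native string form -> canonical (name, generation) pair."""
    name, sep, generation = native.partition("/GENERATION=")
    if not sep:
        raise ValueError(f"no generation in {native!r}")
    return name, generation


def fetch_method(name, generation, callable_interface):
    """Method routine: convert to native form, call the callable
    interface, then convert the returned reference back."""
    native_result = callable_interface(to_native(name, generation))
    return "ok", to_canonical(native_result)
```

Keeping both conversions inside the method routine means that clients only ever see the canonical form, whichever code management system sits behind the server.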

Application Encapsulation 
Encapsulation, the simplest integration technique, 
is appropriate for applications that do not have a 
callable interface or in cases where no source code 
is available. Compilers are ideal candidates for
this style of integration, because they perform syn- 
chronous operations. Encapsulation of compilers 
provides a consistent programming interface to any 
compiler that is integrated into the environment, 
regardless of the qualifiers used to specify particu- 
lar compilation options. This technique can also be 
used to provide a generic compile command that is 
platform independent. Encapsulation of a compiler 
is best accomplished through the use of an operat- 
ing system script. Figure 8 illustrates an example of 
an encapsulated compiler. 

APPL/CONT GET ARGUMENT DEBUG/VALUE = DBG
IF DBG = "TRUE"
    DBG_QUAL = "/DEBUG"
ENDIF

CC 'P1 'DBG_QUAL



Figure 8 Example of an Encapsulated Compiler 

The purpose of an operating system script for
compilation is to convert the generic compilation
qualifiers, which are passed as message arguments,
into the compiler-specific options. The /DEBUG
and /NOOPT qualifiers shown in Figure 8 are examples
of generic compilation qualifiers. Many operating
system scripting languages limit the number of
parameters that can be passed on the command
line. The compilation scripts avoid these limitations
by passing the name of the file to be compiled
as the only command line parameter, as
shown in the command @SYS$LIBRARY:COMPILE.COM
%INSTANCE() in Figure 8. ACA convenience commands,
such as APPL/CONT GET ARGUMENT, are used
to retrieve and set the values of the message arguments
in the operating system script. When all the
switch values are gathered, the operating system
script converts the generic values into specific
qualifiers. Finally, the actual command line is
constructed and executed. This same technique can
also be used to encapsulate linkers and any other
types of applications where no source code or
callable interface is available. When applications
provide a callable interface, even tighter integration
can be achieved by creating an application server.
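
The translation step performed by such a script can be illustrated in Python. This is a sketch only: the mapping table, the function name build_command, and the choice of a UNIX-style cc compiler with its -g and -O0 options are assumptions for the example, not part of the encapsulation shown in Figure 8.

```python
# Assumed mapping from generic message arguments to one compiler's options.
GENERIC_TO_SPECIFIC = {
    "DEBUG": "-g",    # generic /DEBUG
    "NOOPT": "-O0",   # generic /NOOPT
}


def build_command(source_file, arguments, compiler="cc"):
    """Return the argument list for compiling source_file.

    `arguments` maps generic qualifier names to "TRUE" or "FALSE",
    mirroring the message arguments a script would retrieve.
    """
    command = [compiler]
    for name, flag in GENERIC_TO_SPECIFIC.items():
        if arguments.get(name) == "TRUE":
            command.append(flag)
    command += ["-c", source_file]
    return command
```

Supporting another compiler then means supplying a different mapping table; the generic arguments sent by clients do not change. The returned list could be handed to subprocess.run to execute the compilation.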

Application Integration 

Integration of applications goes beyond the inter- 
faces that applications present to the environment; 
it concerns how applications interact with one 
another. Integration also takes into account the 
policies used in an environment to allow a collec- 
tion of applications to be grouped into a single 
composite object. This section discusses concepts 
such as an activity, locating an application within 
an activity, context sharing, and the sharing of 
applications across multiple activities. 

Activity Participation

Since more than one activity may be active at 
any given time, an activity must be able to locate 
the other applications participating in the activ- 
ity. Data-oriented environments provide a means 
to loosely couple the various data and applica- 
tion objects into a single composite object. The 
COHESION integrated environment refers to this 
composite object as an activity. The implementa- 
tion of an activity differs depending upon the envi- 
ronment: ATIS uses a persistent process; file 
system-based environments generally use a direc- 
tory hierarchy; and environments built on a private 
data store can use a data file. In the COHESION envi- 
ronment, an activity is represented as an ACA 
Services context object that contains attributes that 
reference a directory hierarchy. The context object 
is used to set up the execution environment in 
which a set of applications will operate and to 
locate other applications that are executing within 
the activity. 

Locating Activity Applications 
The ability to locate an application that is executing 
in an activity allows for reuse of the application by
other applications executing in that same activity. 
Such locating provides for better utilization of 
applications and reduces the amount of context 
that must be propagated from one application to 
another. To locate an application within an activity, 
an application must have registered its presence in 
the activity. When registering with ACA Services, 
the application must specify the activity name as 
the value of the attribute ACAS_SERVER_REGISTRY. 
The application must also register itself with the 
event manager to allow centralized management of 
the activity and to participate in the flow of work 
within the activity. 

CASE applications determine if they are execut- 
ing within an activity by checking for the existence 
of the environment variable ACTIVITY_NAME. If this
environment variable exists, its value is the activity 
identifier. To allow an activity to extend beyond a 
single host and to support different activities with 
the same name, the activity is identified by a unique 
identifier. 
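
The check described above is a plain environment lookup. A minimal Python sketch follows; the function name current_activity is hypothetical, and the format of the identifier is left opaque because the text does not specify it.

```python
import os


def current_activity(environ=os.environ):
    """Return the activity identifier, or None when the application
    is not executing within an activity."""
    return environ.get("ACTIVITY_NAME")
```

Passing the environment as a parameter keeps the check easy to exercise without changing the real process environment.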

Sharing within Activities 
Applications executing within an activity operate 
in a common context. ACA Services provides a set 
of mechanisms that can be used to provide 
this common context. The environment variable
ACTIVITY_NAME is defined each time a method
server is started in the COHESION environment. The 
method server definition specifies, as the value of
the start-up environment attribute, the names of 
the context tables and attributes that are to be 
defined as environment variables upon start-up. 

Another way of providing a common context 
across an activity is to propagate context object 
tables and attributes as implicit arguments to 
method servers. Specifying this information as 
implicit arguments instructs ACA Services to propa- 
gate these attributes to the context object of the 
method server servicing the request. 

The context object can also be used directly to 
create a common context across an activity, i.e., by 
holding information that needs to be shared. This 
information can include references to directories, 
preferences of applications, and default values. 

Sharing between Activities 
Reusing applications that are active within an activ- 
ity reduces the overall system resources required to 
perform the activity. However, a problem occurs 
when two or more activities are active at the same 
time and require the same application. With the 
addition of windowed interfaces and the need to 
utilize other services, application sizes have greatly 
increased. Consequently, it is often impractical to 
expect a separate instance of an application to be 
associated with each activity that is active. 

In order for an application to be shared between 
multiple activities, the application needs a means 
by which to determine if a request is part of an 
ongoing dialog with another application or is the 
beginning of a new dialog. These dialogs, called 
"sessions," represent a conversation between a pair 
of applications. Each time a client application 
makes a request to a new application server, a ses- 
sion is established and an identifier is associated 
with the session. ACA Services passes the session 
identifier to the server application.

The management of sessions can be accomplished
by using the session ID as a lookup key into
a list of structures that represent the active sessions.
When the server application locates the
structure associated with the session identifier, the
application can establish the appropriate context
for that session. In the example of the DEC/CMS
application server, the structure would contain the handle
to the library associated with the session.

ACA Services also notifies an application server 
when a session is to be terminated between a client 
and a server application. When notified, the application
server determines the appropriate course of
action. Using the CMS example, the server releases 
any cached information it has kept about the ses- 
sion, closes the specific CMS library, and then frees 
the library data block. 
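
Session bookkeeping of this kind can be sketched as a table keyed by session identifier. The Python below is illustrative only: SessionTable, open_library, and close_library are hypothetical names, and a dictionary is used in place of the list of structures mentioned in the text.

```python
class SessionTable:
    """Maps session identifiers to per-session context (illustrative)."""

    def __init__(self, open_library, close_library):
        self._open = open_library      # hypothetical: returns a library handle
        self._close = close_library    # hypothetical: releases that handle
        self._sessions = {}

    def context_for(self, session_id, library_path):
        """Return the context for a session, creating it on first use."""
        ctx = self._sessions.get(session_id)
        if ctx is None:
            ctx = {"library": self._open(library_path), "cache": {}}
            self._sessions[session_id] = ctx
        return ctx

    def end_session(self, session_id):
        """React to a session-termination notice: drop cached data
        and close the library that belonged to the session."""
        ctx = self._sessions.pop(session_id, None)
        if ctx is not None:
            ctx["cache"].clear()
            self._close(ctx["library"])
```

Because every request carries its session identifier, one server process can serve several activities without mixing up their libraries.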

Environment Management 

After defining application interfaces and integrat- 
ing applications into an activity, CASE environment 
developers must focus on the management of the 
environment as a whole. This includes the manage- 
ment of references to applications and data, the 
transformation of object references into platform- 
specific formats, and the flow of work within the 
environment. 

Handle Management 

In the CASE environment, objects are the targets of 
all operations. Sending a message to an object 
requires understanding how to create and manage 
references to the object. Since ACA Services does 
not manage instances of objects, it uses references 
to instances of objects. These references take the 
form of instance and application handles, which 
reference data and application objects, respectively.
Proper management of these handles leads to
more efficient use of application objects, thus 
reducing the amount of network resources and 
memory consumed by the application. Appropriate 
handle management can also enhance performance 
and guarantee predictable behavior. 

Instance Handles 

The creation of an object reference is performed by 
calling the ACAS_CreateInstanceHandle routine. 
ACA Services (1) creates an instance handle from 
the information passed as arguments to the routine, 
(2) allocates memory to the handle and manages
this memory, and (3) sends a message to a storage 
class, if one was specified. 

To avoid creating numerous copies of an instance 
handle, each with its own memory, a cache 
of objects should be used. This is especially 
true in CASE environments that use the data- 
oriented paradigm. Each object structure con- 
tains pointers to both the previous and the next 
object structure in the queue. The structure also 
contains values for the location and reference 
data fields that were passed as arguments to the 
ACAS_CreateInstanceHandle routine and, thus,
allows for the unique identification of an object in
the cache across multiple hosts. In addition to the 
location and reference data, the structure contains 
a pointer to the instance handle returned from the 
call to the ACAS_CreateInstanceHandle routine. 
Reuse of the instance handle saves the time 
required to create the handle, including any over- 
head associated with using storage classes. Reuse 
also reduces the total amount of memory required. 
However, instance handles are not the only handles 
that require management; application handles need 
to be managed as well.
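
A cache of instance handles keyed by location and reference data might look like the following Python sketch. It swaps the doubly linked queue described above for a dictionary, and create_handle is a hypothetical stand-in for the call to ACAS_CreateInstanceHandle.

```python
class InstanceHandleCache:
    """Reuses one handle per (location, reference data) pair."""

    def __init__(self, create_handle):
        self._create = create_handle   # stand-in for the real creation call
        self._handles = {}

    def get(self, location, reference_data):
        """Return the cached handle, creating it only on the first request."""
        key = (location, reference_data)
        handle = self._handles.get(key)
        if handle is None:
            handle = self._create(location, reference_data)
            self._handles[key] = handle
        return handle
```

Including the location in the key is what keeps two files with the same name on different hosts from sharing a handle.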

Application Handles 

Application handles are references to appli- 
cation objects. Each application handle can 
represent one or more method servers. A method 
server can generate a handle by calling the 
ACAS_CreateApplicationHandle routine, or the 
ACAS_InvokeMethod routine can return an applica- 
tion handle as an output argument. As with 
instance handles, application handles can be 
passed as arguments to a message. Management of 
application handles is similar to the management 
of instance handles. Each entry in the cache of 
application handles contains the location of the 
application and the name of the class of appli- 
cation. The entry also contains a pointer to the 
application handle and a count of the number of 
outstanding references to the handle. Freeing an 
application handle results in the termination of all 
sessions between the client and any method 
servers referenced by the handle; it also releases all 
memory associated with the handle. 
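
The reference count kept in each cache entry matters because freeing an application handle ends its sessions. A minimal Python sketch, assuming hypothetical create_handle and free_handle callables in place of the ACA Services routines:

```python
class ApplicationHandleCache:
    """Shares application handles and frees each one only when unused."""

    def __init__(self, create_handle, free_handle):
        self._create = create_handle
        self._free = free_handle
        self._entries = {}  # (location, class_name) -> [handle, refcount]

    def acquire(self, location, class_name):
        """Return a shared handle and count one more outstanding reference."""
        key = (location, class_name)
        entry = self._entries.get(key)
        if entry is None:
            entry = [self._create(location, class_name), 0]
            self._entries[key] = entry
        entry[1] += 1
        return entry[0]

    def release(self, location, class_name):
        """Drop one reference; free the handle when none remain."""
        key = (location, class_name)
        entry = self._entries[key]
        entry[1] -= 1
        if entry[1] == 0:
            self._free(entry[0])
            del self._entries[key]
```

Deferring the free until the count reaches zero avoids terminating sessions that another part of the client is still using.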

Each instance handle should be associated with a 
corresponding application handle. This association 
allows the application handle to be reused when 
sending additional requests to the application con- 
cerning the data object. An application handle asso- 
ciated with a cache entry can be used to make the 
request. Failure to find the application in the cache 
could indicate that the appropriate invocation flag 
should be used to obtain an application when calling
the ACAS_InvokeMethod routine.

As described, proper handle management can 
result in better performance, better resource uti- 
lization, and predictable behavior within the envi- 
ronment. However, handle management does not 
deal with how to create an object reference that, 
when presented to an application on a remote host, 
is in a format native to that platform. For this capa- 
bility, we must turn to storage classes. 

Data Transformations Using
Storage Classes 

Distributed CASE environments, whether homoge- 
neous or heterogeneous, must concern themselves 
with the representation of object references that 
are shared among different applications. File speci- 
fications exemplify this problem. Given multiple 
hosts, it is unlikely that two hosts have the same 
path to a specified file, even if both hosts are of
the same platform type. Consider the scenario in
which Application A sends the Edit message to the
file object $PROJ4:[PROJECT.SRC]SORT.C, resulting
in a request of Application B to edit the contents of 
the file. The problem becomes complicated if 
Application B is executing on a different platform 
type than Application A. 

To solve the problem, the environment can utilize
the functionality provided by ACA Services storage
classes. Storage classes provide a mechanism
for translating an object's reference data from one 
file system representation to another. A solution 
to the scenario described involves implementing a 
set of methods that would be executed when the 
object reference uses a storage class. 

The SC_COHESION storage class is a CASE-specific
storage class, which is a refinement of the SC_FILE
storage class provided by ACA Services. As a refinement,
SC_COHESION inherits all the messages defined
on its parent storage class, including the messages
SetInstance and GetInstance. The methods for these
two messages provide an implementation for mapping
file system specifications from platform-specific
formats to platform-independent formats
and back again. The storage class methods do this by
utilizing device and directory information, called
directory mappings, found in the context object.

The directory mappings stored in the context 
object provide a means to associate a physically 
shared directory path with a network path name. 
The network path name is a platform-independent 
name that, when presented to a remote platform, 
can be mapped into a format native to the platform 
receiving the request. A network path name and its 
mapping are stored as an attribute-value pair in the 
PATHNAME_REGISTRY table of a context object.
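
The effect of directory mappings can be illustrated with two per-host tables. Everything in this Python sketch is assumed for the example: the network path name PROJ4_SRC, both native directory strings, and the function names. The real tables live in the context object and are consulted by the storage class methods.

```python
# Hypothetical per-host tables: network path name -> native directory.
VMS_HOST = {"PROJ4_SRC": "$PROJ4:[PROJECT.SRC]"}
UNIX_HOST = {"PROJ4_SRC": "/proj4/project/src/"}


def native_to_network(native_path, mappings):
    """Replace a native directory prefix with its network path name."""
    for network_name, native_dir in mappings.items():
        if native_path.startswith(native_dir):
            return network_name, native_path[len(native_dir):]
    raise KeyError(f"no directory mapping covers {native_path!r}")


def network_to_native(network_name, file_name, mappings):
    """Expand a network path name using the receiving host's table."""
    return mappings[network_name] + file_name
```

The sending host converts its native specification to the pair (network path name, file name), and the receiving host expands that pair with its own table, so neither side needs to know the other's file system syntax.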

The directory mapping functionality allows ref- 
erences to file objects to be passed between appli- 
cations on different hosts in a way independent of 
the platform. This same scheme can also be used to 
convert object references into object identifiers, such
as ATIS element IDs for use with the CDD/Repository
software. In the implementation for the file system,
the method associated with the SetInstance message
must determine the data class of the object reference,
as well as transform the reference data into
its network format. The determination can be made 
in a number of ways, the most common of which 
is to base the class on the extension of the file. 
Although not the most accurate method of deter- 
mining the class, this approach does meet the needs 
of many files. 
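
Determining a data class from a file extension reduces to a table lookup. In the Python sketch below the class names are hypothetical, since the text does not list the actual data classes, and the function expects a bare file name without a device or directory part.

```python
import os.path

# Hypothetical data class names, chosen for illustration.
CLASS_BY_EXTENSION = {
    ".c": "C_SOURCE",
    ".h": "C_HEADER",
    ".obj": "OBJECT_FILE",
}


def data_class_for(file_name, default="FILE"):
    """Pick a data class from the extension of a bare file name."""
    _, extension = os.path.splitext(file_name)
    return CLASS_BY_EXTENSION.get(extension.lower(), default)
```

The default class covers files whose extension is missing or unknown, which is where the inaccuracy noted above shows up.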

Work Flow Management 

ACA Services manages the various instances of exe- 
cuting applications but does not understand the 
concept of an activity. Therefore, managing the 
applications within the activity requires the use of 
an application that understands this concept. The 
event manager, which acts as a central registry of 
active applications and their associated activities, 
can provide a simple form of work flow manage- 
ment within the environment. However, the event 
manager is used only in a limited capacity in the 
COHESION integrated environment. In COHESION, 
the event manager is notified each time an applica- 
tion is started or stopped in an activity. The applica- 
tion provides an application handle to itself, which 
is used by the event manager to notify the applica- 
tion of events of interest. The use of the event man- 
ager removes the need for an application to forward 
certain messages, as a result of an event in the envi- 
ronment, to all applications with which it has been 
communicating. Removing the need to forward 
messages reduces both the chances of loops form- 
ing in a set of applications and any communication 
deadlocks between applications. 

Events and Triggers 

On registration, an application can express interest 
in being notified about particular events. Events 
are categorized into two classes: system events 
and application events. System events affect the 
overall operation of the environment. These events 
include shutdown and changes in activities. All 
applications in the COHESION environment are 
notified of the system events for activity shutdown, 
iconification, and deiconification. Application 
events occur when the state of an object in the envi- 
ronment changes. File modification or completion 
of a build step are typical examples of application 
events. Other applications in an activity can use 
these events for synchronization or as notifications 
that cause a change in behavior. Such notifications 
have traditionally been called triggers. 
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For example, in a simple build system such as the 
make utility, events can create a work flow that 
would automatically compile and link an applica- 
tion when one module changes. If the build process 
completes successfully, the work flow automati- 
cally starts the debugger to debug the newly built 
executable file. If the build fails, the work flow 
loads the faulty module into a program editor and 
positions the cursor to the line where the error 
occurred. 
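
The build work flow in this example can be expressed as three triggers. The Python sketch below is an illustration under stated assumptions: EventManager is a toy publish-and-subscribe registry rather than the COHESION event manager, and the event names and the build callable are hypothetical.

```python
from collections import defaultdict


class EventManager:
    """Toy registry of callbacks interested in named events."""

    def __init__(self):
        self._interested = defaultdict(list)

    def register(self, event, callback):
        """Record that a callback wants to hear about an event."""
        self._interested[event].append(callback)

    def notify(self, event, **details):
        """Call every callback registered for the event."""
        for callback in list(self._interested[event]):
            callback(**details)


def make_build_flow(manager, build, log):
    """Wire triggers: a modified file starts a build, and the build
    outcome starts either the debugger or the editor."""

    def on_modified(module):
        succeeded = build(module)
        outcome = "build_succeeded" if succeeded else "build_failed"
        manager.notify(outcome, module=module)

    manager.register("file_modified", on_modified)
    manager.register("build_succeeded", lambda module: log.append(f"debug {module}"))
    manager.register("build_failed", lambda module: log.append(f"edit {module}"))
```

Each step knows only the events it listens for and raises, so steps can be added to or removed from the flow without changing the others.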

Summary 

ACA Services can be used to resolve many problems 
encountered in a distributed, multivendor environ- 
ment. The object-oriented approach provided by 
ACA Services can aid in the construction of a CASE 
environment that promotes the plug-and-play con- 
cept across a number of different platforms and 
network transports. ACA Services provides a means 
of developing client-server applications and of 
abstracting the network dependencies away from 
the developer. This feature, together with the use of 
storage classes and data marshaling, can help to 
exchange information in a heterogeneous environ- 
ment. At the same time, ACA Services can provide a 
consistent programming interface to all compo- 
nents in the system. The dynamic nature of ACA 
Services allows new components to be added to the 
environment without the need to rebuild the entire 
environment. The flexibility of ACA Services allows 
its use to construct a CASE environment regardless 
of the integration paradigm used and while sup- 
porting a number of interaction models. ACA 



Services provides the infrastructure necessary to 
integrate the large number of existing applications 
into distributed, heterogeneous environments. 


DEC @aGlance — Integration of
Desktop Tools and Manufacturing 
Process Information Systems 

The DEC @aGlance architecture supports the integration of manufacturing process
information systems with the analysis, scheduling, design, and management tools
that are used to improve and manage production. DEC @aGlance software comprises
a set of run-time libraries, an application development tool kit, and extensions
to popular spreadsheet applications, all implemented with Digital's
object-oriented Application Control Architecture (ACA) Services. The tool kit helps
developers produce DEC @aGlance client and server applications that will interoperate
with other independently developed DEC @aGlance applications. Spreadsheet
extensions (add-ins) to Lotus 1-2-3 for Windows and to Microsoft Excel for Windows
allow users to access real-time and historical data from DEC @aGlance servers. With
DEC @aGlance software, control engineers and other manufacturing process professionals
can use familiar desktop tools on a variety of platforms and have simple,
interactive, and transparent access to current and past process data in their plants.



At a chemical plant that has been producing nylon 
using the same process for over 35 years, the lead 
control engineer told an interviewer that what he 
likes about his job is that "it is totally different every 
day." 1 To an outside observer, the operation of a
process plant, such as a refinery or paper plant, 
appears to be an unchanging flow of materials 
into a tightly controlled and repetitive process 
that produces a continuous flow of unvarying 
product, 24 hours a day, 365 days a year. In reality,
the operation of these plants is far more complex 
and challenging, involving constant adjustment to 
changing conditions, aging equipment, and varia- 
tions in raw materials, as well as constant monitor- 
ing for equipment malfunctions. 

The operation of a large process plant involves 
the functioning of numerous valves, switches, 
pumps, other actuators, and sensors measuring and 
controlling the levels, pressures, temperatures, and 
flows of various materials through a complex series 
of pipes, tubes, tanks, and vessels. In addition to 
detecting and managing failures in these compo- 
nents, a large proportion of the personnel in the 
plant is involved in process and product improve- 
ment. The personal computer or workstation 
and an array of sophisticated desktop tools allow
data to be analyzed, visualized, manipulated, and
explored in ways that support creative problem 
solving. Getting timely information about the pro- 
cess into the appropriate problem-solving tools is, 
however, difficult. This paper begins with some 
background about manufacturing process infor- 
mation systems and the need for access to system 
data. The paper then describes the development of 
DEC @aGlance software and the choice and use of 
Application Control Architecture (ACA) Services to 
solve the problem of integrating independently 
developed applications in the manufacturing 
space. 2 

Background 

In large manufacturing facilities, the production 
process is controlled through the use of advanced 
automation systems. These systems may track thou- 
sands of temperatures, flows, pressures, and levels 
and can drive hundreds of pumps, valves, and other 
actuators. To implement control strategies, such 
systems may compute large numbers of complex, 
dynamic control algorithms. Usually, additional sys- 
tems measure various physical properties of the 
product, such as color, weight, viscosity, thickness, 
and moisture content. Supervisory control systems 
often coordinate parts of a complex process, as
well as implement higher-level control and produc- 
tion strategies and keep historical records of key 
process variables. 

The control of a large plant is usually imple- 
mented through strategies that allow the control 
problem to be divided into smaller parts, as illus- 
trated in Figure 1. Each piece of the system is 
responsible for the control of a subsystem (e.g., 
steam generation and distribution, or cooling fluids),
a part of the process (e.g., premixing, material
storage, or reaction), or an area of the plant (e.g., 
packaging line, product stream, or finished goods 
management). Within each subsystem, there is typi- 
cally a hierarchy of control. The lowest-level com- 
ponents control activities that require responses 
within less than a second to as much as one minute 
(direct control). The next level of systems control 
activities that require responses within less than 
a few minutes (distributed control). Above this 
level of response are systems that control activities 
that may not change for long periods or that imple- 
ment control algorithms that involve measurements 
from more than one lower-level system (supervisory
control). At the plant level, additional
control systems may exist to implement control
algorithms that reflect changes in the markets for 
products, market opportunities, and fluctuations in 
raw material availability and composition, along 
with the information about the process that is sup- 
plied by the lower- level systems (high-level con- 
trol). Scattered among these levels may be various 
additional systems that schedule preventive main- 
tenance, identify equipment failures, and advise on 
process improvements — all based on information 
about process from the other systems in the plant. 

Distributed control systems include an operator 
console that consists of multicolor displays, push 
buttons, warning lights and buzzers, a touch screen 
or trackball, and industrialized keyboards with as 
many as 36 special function keys. The displays 
allow an operator to oversee all parts of the process 
for which the operator is responsible. Typical dis- 
plays show recent trends of key variables and mimic 
diagrams showing the current state of the manufac- 
turing equipment (e.g., valve positions and tank lev- 
els) and of the material flowing through the 
process. The keyboard and other input devices 
allow the operator to select displays, request 
reports, and modify control settings. Response to 
problem or alarm conditions and modification of 
the process to change the product are effected 
through the console. 

Process operators are responsible for maintain- 
ing the routine operation of a plant. Operators use 
the control system to change process parameters in 
order to produce different mixes or variants of the 
product, or to respond to an equipment failure by 
rerouting material around nonoperational process 
equipment. 

To perform their functions, manufacturing plant 
production and engineering support personnel 
(e.g., control engineers, process engineers, produc- 
tion supervisors, production planners, mainte- 
nance supervisors, and manufacturing engineers) 
also need access to information in the control and 
supervisory systems. These professionals regularly 
access information contained in multiple manufac- 
turing systems and have an occasional interest in 
particular measurements or parameters within 
other parts of the process. The functions of these 
manufacturing plant personnel include 

■ Complex problem analysis and solution. 
Locating sources of product or process variation 
involves analyzing information from different 
parts of the process that may be under the con- 
trol of different automation systems. Comparing 
the flow that exits one part of the process 
with the flow that then enters the subsequent 
part, for example, could disclose a faulty flow 
meter, a previously unknown temperature con- 
trol problem, or a leak. 

■ Product improvement. Improving product qual- 
ity and consistency involves investigating how 
the product is affected by existing variations 
in the production process. For example, investi- 
gation may involve the study of a process vari- 
able that cannot be measured directly but can be 
calculated from the values of other process vari- 
ables. Examining sets of variables over time and 
exploring possible relationships may result in 
discovering combinations of process variables 
that yield unexpected effects on product 
attributes. 

■ Process improvement. Improvements in process 
yield and process reliability and reduction of 
waste and hazardous by-products may involve 
the study of historical data values from the pro- 
cess. Studying measurements obtained from 
multiple control systems may also result in pro- 
cess improvements. 



■ Resource optimization. Usually, process plants 
are capable of producing different grades of 
product, as well as mixtures of end products. 
An oil refinery, for example, produces various 
grades of fuel oil and also home heating and 
lubricating oils, all from a single process. While 
the operators adjust the equipment to control 
the product mix, a process planner or produc- 
tion manager determines the best production 
schedule based on customer orders and the effi- 
cient use of the process equipment. 

Process information is available to operators 
and engineers who are trained to work with the 
various control and management systems in the 
plant. Using proprietary tools for each system 
allows reports to be generated and specific types 
of analyses to be performed on the data contained 
within each of these systems. However, extracting 
the data from these systems to an engineer's desk- 
top for analysis by generic tools, such as spread- 
sheets and statistical analysis packages, is difficult 
or even impossible. Lack of console- and tool- 
specific training is another obstacle to accessing 
process information. 

Manufacturing Process Information 
Systems and Desktop Systems: 
Goals and Barriers 

Production and engineering support personnel 
want to be able to use the desktop tools of their 
choice to explore and analyze data from manufac- 
turing systems. Spreadsheets, simulation tools, 
report generators, visualization tools, statistical 
analysis tools, planning tools, charting tools, and 
graphic-generation tools have all become accepted 
parts of the array of computer-aided techniques and 
tools available to the contemporary knowledge 
worker. The interactive, easy-to-use graphical user 
interface, which can run on relatively inexpensive 
platforms under the complete control of the end 
user, has not only encouraged the wide use of these 
desktop tools but also enhanced their effectiveness. 
These tools stimulate professionals to creatively 
explore the character of large amounts of data and 
thus support the discovery of previously unex- 
pected patterns and relationships. 

The further an end user's primary function is 
from production, the more likely it is that such a 
user will want access to multiple systems. System 
interfaces, which may differ widely and are gener- 
ally oriented toward production use, discourage 
users from making ad hoc inquiries into the system. 
Consequently, manufacturing system data may not 
be easily accessible to users of the many desktop 
tools available for such purposes as decision sup- 
port, research, analysis, and simulation. 

Today, the use of data from the manufacturing 
process in planning, reporting, and managing the 
operation of a plant is hampered by the difficulty in 
accessing the data from plant control and process 
information systems. It is typical for a production 
supervisor who needs data from a control system 
to request the data from a process operator. Once 
in hand, the data is then manually entered into a 
spreadsheet or other desktop tool for analysis. The 
results of the analysis often require entering new 
parameter values into the control system. This task 
is typically performed by another person, trained 
to use the control system, who transcribes the val- 
ues from a hard copy of the tool's output. The pro- 
cess is time-consuming, costly, and error prone. 
Problem-solving activities are limited to those that 
can justify the trouble and expense involved in sim- 
ply accessing the data. 

Existing Integration Efforts 
The desire to use data from the control systems 
to analyze and improve the understanding and con- 
trol of the manufacturing process has spawned 
a variety of efforts since the late 1980s. This work 
has attempted to ease the transfer of information 
between computing systems and control systems. 
However, the resulting products and standards are 
not oriented toward supporting ad hoc inquiries 
and, therefore, are not widely used. 

Many currently available manufacturing systems 
may be connected to the plant network, but with- 
out standard higher-level interfaces, access to these 
systems remains limited. Through such network 
connections, some manufacturing systems pro- 
vide limited access to OpenVMS and/or DOS system 
users. However, the access is typically restricted to 
the use of unique, proprietary programming inter- 
faces or to proprietary tools targeted at performing 
a manufacturing-related function, such as statistical 
quality control. Usually, interfaces are supplied 
only on a specific operating system or on limited 
versions of a specific operating system. 

In some systems, it is possible to extract a table of 
data values into a file using a common representa- 
tion and file format (such as Lotus Development 
Corporation's WK1) that can then be imported into 
a spreadsheet on an IBM-compatible PC. This tech- 
nique obviates the need for hard-copy output and 
simplifies transcription but still requires that a 
specialist extract the data using proprietary inter- 
faces. In addition, the data may need to be con- 
verted from string to numeric format to be usable 
within a particular spreadsheet. 

The International Organization for Standard- 
ization standard Manufacturing Messaging Speci- 
fication (IS9506 or MMS) addresses the problem 
of data exchange between applications and dedi- 
cated manufacturing systems (referred to in the 
standard as manufacturing devices). Although 
some manufacturers of programmable controllers 
(that is, dedicated control systems that are pri- 
marily used in discrete manufacturing industries) 
offer MMS capabilities, the process industry manu- 
facturers and their control system suppliers have 
not widely accepted MMS. Use of the standard 
has been perceived as expensive, inefficient, and 
oriented primarily toward the needs of discrete 
manufacturing. A committee of the Instrument 
Society of America (ISA) is developing a companion 
standard (ISA 72.02) to use with MMS in communi- 
cating with distributed control systems in process 
manufacturing. 9 An important aspect of this pro- 
posed standard is a data model that describes the 
organization and types of data in a distributed con- 
trol system. 

Requirements for Integration 

Digital designed the DEC @aGlance architecture not 
to be a generic application integration mechanism 
but rather to support the integration of popular 
desktop tools with manufacturing process informa- 
tion systems. An application that complies with the 
architecture can be installed on any system within 
a network, run, and immediately exchange data 
with other compliant applications. Some key char- 
acteristics of the environment that helped to drive 
the architecture are 

■ Multiple vendors. Although MS-DOS personal 
computers are the most popular desktop envi- 
ronment, VAXstation, Macintosh, and UNIX work- 
stations have a clear presence in particular 
departments and in certain large customer sites. 

■ Multiple software developers. The applications 
to be integrated are products of many compa- 
nies that build manufacturing systems and desk- 
top tools. The software development groups in 
these companies focus on core application and 
human interface issues rather than on integra- 
tion issues. 

■ A large variety of desktop applications and user 
interfaces. Each class of desktop application 
has a different way of interacting with users. 
Spreadsheets, for example, have very different 
user interfaces from statistical packages and data 
visualization packages. Some applications have 
elaborate macro languages, whereas others are 
almost entirely graphically driven. 

■ Multiple types of large networks. In the typical 
process manufacturing facility, large networks 
are already in place. While many plants use 
DECnet for their network, an increasing number 
of plants are choosing to use the transmission 
control protocol/internet protocol (TCP/IP), 
and some plan to migrate to Open Systems 
Interconnection (OSI) networks (including 
Digital's DECnet Phase V) from multiple vendors. 
PC LANs are also becoming popular. 

■ Conservative computing strategies. Large 
manufacturing facilities cannot afford to halt 
operation to make major changes in their 
production-related computing systems and net- 
works. Such facilities look to standards-based 
products as a way of achieving stability and of 
ensuring confidence in the longevity of a partic- 
ular technology. 

Architectural Issues 

Simply stated, the problem that the DEC @aGlance 
architecture attempts to address is, how can a 
set of existing applications running on heteroge- 
neous platforms, distributed across a variety of 
networks, and developed by different vendors 
(with only peripheral interest in integration) be 
easily integrated? A good understanding of both 
the nature of the applications involved and how end 
users would use them if they were integrated is 
important for evaluating potential answers to the 
question. 

The applications that we considered integrating 
can be divided into two groups: those that "own" 
manufacturing data, i.e., the manufacturing control 
systems, and those that are consumers of that data, 
i.e., the desktop tools. From the viewpoint of an 
end user, some aspects of the relationship between 
a desktop tool and a manufacturing control applica- 
tion must be considered in order to accomplish 
work goals. End users in this environment are 
primarily concerned about the manufacturing 
process, the equipment controlling the process, 
and the state of materials within the process. These 
users have little or no interest in such aspects 
as network topologies and protocols, operating 
systems, and byte ordering on different hardware 
platforms. 

Some major concerns of the end user that the 
architecture should address are 

■ The identity of the manufacturing control sys- 
tem. Generally, a large plant is controlled 
through the use of several control systems, each 
of which might control a part of the process, 
such as refining or packaging, or an aspect of the 
plant operation, such as steam distribution or 
waste reprocessing. A particular data point 
resides in a single manufacturing control system. 
The user should be able to specify precisely 
which manufacturing system is to supply the 
data values. The architecture should be capable 
of establishing a relationship with the specific 
application that owns the data of interest to the 
user. The end user should not have to specify 
the network node, the operating system, 
or the hardware platform on which the applica- 
tion is running. Neither should the end user have 
to specify the network communication proto- 
cols required. 

■ The length of the relationship between the desk- 
top tool and the manufacturing control applica- 
tion. The relationship should be able to remain 
active for multiple transactions to allow end 
users to work interactively with desktop tools 
to explore possibilities. For example, end users 
may want to examine different data points or the 
same data point over various time intervals. 
Thus, usage of a desktop tool could involve mul- 
tiple requests for data from a manufacturing 
control application. Establishing a relationship 
between applications over a network is time- 
consuming, and therefore establishing long- 
lived relationships would be advantageous. The 
ability to continuously monitor a set of points 
and have their values reported on a time or 
change basis is another desirable feature that 
would require the establishment of long-lived 
relationships. 

■ Multiple access to the applications. Applica- 
tion relationships should not be exclusive. 
Each application should be able to have concur- 
rent relationships with several partner applica- 
tions. Each desktop tool may require data from 
several manufacturing systems, and conversely, 
several users of desktop tools may need to 
access the same control system simultaneously. 
The relationships between desktop tools and 
manufacturing control systems are illustrated in 
Figure 2. 

■ The data model. Applications should agree about 
how to reference data and about data types. 
Within the context of this environment, a rela- 
tively simple data model exists in the draft stan- 
dard ISA 72.02. Data should always be converted 
to types appropriate to the local system and to 
the application. A spreadsheet user should not 
have to manually convert strings into numeric 
values. 

■ The user interface. Application integration 
should not require the use of any particular 
desktop user interface, such as the X Window 
System or DECwindows software, or even the 
existence of a windowing system. Also, the user 
interface of the manufacturing data application 
should be of no concern to the desktop user. 

Usage Model 

To help us understand how a user might go about 
employing the capabilities that we were consider- 
ing, we developed a simple usage model. We based 
the model on the scenario that an end user makes a 
series of ad hoc inquiries into the state of a process. 
We assumed that the user was familiar with the 
manufacturing process but not necessarily expert 
in all the details of the process. The user would 
know, for example, what the major areas of the 
plant were called and what functions they per- 
formed but might not know the internal reference 
identifier of every flow meter in each control sys- 
tem. We focused on how the user of a spreadsheet 
tool might reasonably expect to proceed to get data 
into a spreadsheet and how services that we might 
provide could aid in exploring the data. 

The information within a manufacturing system 
consists of the many parameters and measurements 
that the system uses to monitor and control the pro- 
cess. Generally, this data is organized into blocks, 
each one related to a particular part of the process, 
such as flow, level, temperature, or pressure. As the 
typical data block in Figure 3 illustrates, every 
block has a unique name or tag that can be used for 
reference purposes. 

In control systems, tag names are assigned as part 
of the configuration. Large plants use a naming con- 
vention to ensure the assignment of unique tag 
names to the thousands of blocks spread through- 
out the plant and over several control systems. In 
addition to the tag, the block contains attributes 
such as the parameters of the control algorithm, 
measured input values, and unit conversion algorithm 
identifiers. The data model proposed by the ISA 
72.02 committee describes seven types of blocks, 
each with a standard set of attributes with associ- 
ated names and data types. 

Figure 2 Relationships between Desktop Tools 

This usage model allows a user to easily deter- 
mine the tag names recognized by a particular 
manufacturing system. To examine the data values 
associated with a specific tag, the user needs to 
know the valid attributes. (Not all blocks have 
the same attributes; e.g., an analog loop control 
block has more attributes than a simple digital mon- 
itoring block.) Once the tag names and their valid 
attributes are known, the user can inquire about 
current values as well as historical values. 

The use of operating prototypes, including simu- 
lated servers and a simple spreadsheet, advanced 
the development of the usage model. The proto- 
types were shared with potential end users and 
application developers at customer visits and indus- 
try trade shows. Feedback obtained from demon- 
strations and discussions of the usage model helped 
expand and refine the services. 

Architecture 

The DEC @aGlance architecture defines two kinds 
of applications, a set of services for accessing data 
in the control systems, a data specification model, 
and some basic types of data. The application 
classes are (1) manufacturing data servers and 
(2) clients. Typical manufacturing data servers are 
the manufacturing control system applications. 
Typical clients include desktop tools such as 
spreadsheets and statistical analysis tools, as well as 
production planning, production scheduling, and 
other production management applications. An 
application may be a client in relation to one appli- 
cation and a server in relation to another. 

A data point is specified to DEC @aGlance appli- 
cations by the name of a server, a tag name, and an 
attribute name. A data point has a current value and 
may also have historical values (if the manufactur- 
ing system has a historian capability). A current 
value is the most recent available value of a parame- 
ter or measurement within the system. A historical 
value is a value that the data point had at some time 
in the past. A historical value is specified by the 
name of a server, a tag name, an attribute name, and 
the time associated with the value. 
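
The addressing scheme described above can be pictured with a short Python sketch. It is illustrative only: the class names `DataPointRef` and `HistoricalRef`, the field names, and the example server and tag names are assumptions made for this example and are not part of the DEC @aGlance interface.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataPointRef:
    """Hypothetical reference to a data point: server, tag, and attribute."""
    server: str
    tag: str
    attribute: str


@dataclass(frozen=True)
class HistoricalRef:
    """Hypothetical reference to a historical value: a data point plus a time."""
    point: DataPointRef
    time: datetime


# The server, tag, and attribute names below are invented for illustration.
level = DataPointRef(server="BOILER_DCS", tag="LIC101", attribute="MEASUREMENT")
level_at_noon = HistoricalRef(point=level, time=datetime(1993, 6, 1, 12, 0))
```

A current value needs only the three names; a historical value adds the time, which mirrors the two definitions in the preceding paragraph.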

The services defined by the DEC @aGlance archi- 
tecture fall into one of four functional categories: 
configuration information, data value exchange, 
monitoring, or management. Each service defines 
an operation that may be requested by one applica- 
tion of a partner application. The services defined 
are not necessarily the same functions that an end 
user requests. 



Configuration Information 
One service is defined for requesting the tag names 
that the server finds in the control system's 
database. An additional service returns a list of 
attribute names that are defined for a specified tag 
name or a list of tag names. 

Data Value Exchange 

Services are defined for reading and for writing 
current and historical data point values. For current 
values, services support reading or writing either 
a list or a table of data point values. A read or write 
list request specifies pairs of tag names and 
attributes. A read or write request for a table of data 
point values specifies a list of tag names and a list 
of attributes. The table of data points consists of 
all tag names paired with their corresponding 
attributes. Both the list and the table requests can 
be used to read or write a single data point, collaps- 
ing to either a list or a table of one data point. 
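
The difference between the two request forms can be sketched as follows. This is a minimal Python illustration, not the DEC @aGlance calling interface: the function names and tag names are hypothetical, and the sketch assumes that a table request yields every listed tag name paired with every listed attribute.

```python
from itertools import product


def list_request(pairs):
    """A list request names each (tag, attribute) pair explicitly."""
    return list(pairs)


def table_request(tags, attributes):
    """A table request names tags and attributes separately.

    Assumption: the resulting table holds every tag paired with every
    attribute, ordered by tag and then by attribute.
    """
    return [(tag, attr) for tag, attr in product(tags, attributes)]


# Tag and attribute names are invented for illustration.
pairs = table_request(["FIC100", "FIC200"], ["MEASUREMENT", "SETPOINT"])
single = table_request(["FIC100"], ["MEASUREMENT"])
```

With one tag name and one attribute, both forms describe the same single data point, which is the collapsing behavior noted in the text.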

By using the DEC @aGlance services to get lists of 
tag names, attribute names, and data point values, 
and the name of a server, an end user can generate 
a wide range of ad hoc queries without knowing 
much about the control system in advance. A com- 
mon data point attribute is the descriptor, which 
characterizes the function of the data point, e.g., 
south tank level. Thus, it is a fairly straightforward 
task to use DEC @aGlance services to build a list of 
tag names and descriptors that provide a basis for 
further inquiries. 
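
As a rough illustration of such an ad hoc inquiry, the Python sketch below builds a list of tag names and descriptors from an in-memory stand-in for a data server. The class `SimulatedServer`, its method names, the attribute name `DESCRIPTOR`, and the sample blocks are all assumptions made for this example; they do not reproduce the actual DEC @aGlance routines.

```python
class SimulatedServer:
    """In-memory stand-in for a manufacturing data server (illustrative only)."""

    def __init__(self, blocks):
        self._blocks = blocks  # {tag name: {attribute name: value}}

    def get_tag_names(self):
        return sorted(self._blocks)

    def get_attribute_names(self, tag):
        return sorted(self._blocks[tag])

    def read_list(self, pairs):
        return [self._blocks[tag][attr] for tag, attr in pairs]


def tag_directory(server, descriptor_attr="DESCRIPTOR"):
    """Return (tag, descriptor) rows for every tag that has a descriptor."""
    tags = [t for t in server.get_tag_names()
            if descriptor_attr in server.get_attribute_names(t)]
    values = server.read_list([(t, descriptor_attr) for t in tags])
    return list(zip(tags, values))


# Sample blocks are invented; the second block has no descriptor attribute.
server = SimulatedServer({
    "LI200": {"DESCRIPTOR": "south tank level", "MEASUREMENT": 4.2},
    "XS300": {"STATE": 1},
})
directory = tag_directory(server)
```

The sketch uses only the three kinds of requests named in the text: tag names, attribute names, and data point values.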

The services for historical data values are defined 
to deal with tables of historical values for a list of 
data points. Historical data service requests specify 
a list of tag name and attribute pairs and a time 
specification that is applied to all the data points. 
The time specification consists of a start time, a 
time interval, and the number of intervals for which 
values are to be returned. 
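
The time specification can be pictured as a simple expansion into timestamps. The Python sketch below is illustrative; the function name is hypothetical, and it assumes one value per interval, stamped at the start of each interval, which the text does not state.

```python
from datetime import datetime, timedelta


def sample_times(start, interval, count):
    """Expand (start time, interval, number of intervals) into timestamps.

    Assumption: one value per interval, stamped at the interval's start.
    """
    return [start + i * interval for i in range(count)]


# Four 15-minute intervals beginning at 08:00 (example values only).
times = sample_times(datetime(1993, 6, 1, 8, 0), timedelta(minutes=15), 4)
```

Because a single time specification applies to every data point in the request, the result is a table with one row per timestamp and one column per data point.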

Monitoring 

Monitoring is useful for reading the values of a 
set of data points at intervals in time or when a sig- 
nificant change in value occurs for any of the data 
points. A graphical display program can run on 
a desktop system and make minimal use of the net- 
work and computing resources while maintaining 
an accurate representation of what is occurring in 
the manufacturing process. Monitoring could also 
be used to update a spreadsheet at regular time 
intervals or whenever a particular process variable 
changes. 

No standard definitions exist for what consti- 
tutes a significant change in value. Definitions sup- 
ported for various systems include (a) detection of 
change outside of a specified range or "dead band," 
(b) change by more than some percentage of the 
previously reported value, and (c) change by more 
than some percentage of a fixed value. Therefore, 
the service is defined to support monitoring and 
reporting of changes on a time basis or on some 
other basis that is specific to the data server appli- 
cation. Whenever the requested monitor condition 
is fulfilled, the data server application uses a moni- 
tor update service to send the new data point val- 
ues to the original client application. Since the 
server initiates monitor update requests, the usual 
relationship between the client and the server is 
temporarily reversed. 
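
The three definitions of significant change listed above might be expressed as the following tests. This is a sketch in Python under stated assumptions: the function names and parameters are hypothetical, the dead band is modeled as a low and high limit, and real servers may define these conditions differently.

```python
def outside_dead_band(value, low, high):
    """(a) The value lies outside a specified range, or dead band."""
    return value < low or value > high


def changed_by_percent_of_previous(value, previous, percent):
    """(b) The change exceeds a percentage of the previously reported value."""
    return abs(value - previous) > abs(previous) * percent / 100.0


def changed_by_percent_of_fixed(value, previous, fixed, percent):
    """(c) The change exceeds a percentage of a fixed value, such as a span."""
    return abs(value - previous) > abs(fixed) * percent / 100.0
```

A server would evaluate whichever test it supports each time a monitored value is sampled and send a monitor update to the client only when the test succeeds.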

Management 

Connection management services are provided to 
establish a connection, to terminate a connection, 
and to test a connection. 

Implementation Considerations 

Using existing networking and application integra- 
tion technologies to implement the DEC @aGlance 
architecture was important both in terms of 
reducing development efforts and improving com- 
patibility with existing environments. Technol- 
ogy used in the implementation had to provide 
as many of the capabilities described in the 
architecture as possible while imposing minimal restric- 
tions on the end-user operating and network 
environments and on the developers of the appli- 
cations. In addition, it was desirable that the under- 
lying technologies offer capabilities that could 
support future enhancements to the DEC @aGlance 
architecture. 

The DEC @aGlance architecture allows an existing 
desktop tool to be integrated with existing manu- 
facturing control systems, as shown in Figure 4. The 
architecture effectively combines the functional 
capabilities of the desktop tool for analysis, visual- 
ization, computation, etc., with the capabilities of 
the manufacturing control system for monitoring 
and controlling a manufacturing process. The indi- 
vidual applications were, of course, originally 
designed and written without any knowledge of 
each other's existence. Therefore, to facilitate inte- 
gration efforts, implementation of DEC @aGlance 
software should localize and minimize required 
changes to the applications. 

A network protocol such as DECnet, the transmis- 
sion control protocol/internet protocol (TCP/IP), or 
one of the local area network (LAN) protocols could 
have provided the network services required 
for DEC @aGlance's interapplication communica- 
tions. However, this approach lacks a mechanism 
for locating servers on the network, requires 
DEC @aGlance to support the multiple network 
protocols that exist in the manufacturing environ- 
ment, requires DEC @aGlance to include data type 
conversion between application platforms, and 
necessitates the development of monitoring and 
management tools unique to DEC @aGlance. A bet- 
ter approach is to use an existing product that is 
available on an appropriate set of platforms, sup- 
ports an appropriate set of networks, and already 
solves these problems. 

Figure 4 Integrating Desktop Tools and Manufacturing Systems 

A remote procedure call (RPC) mechanism 
appears to have many of the capabilities that 
the DEC @aGlance architecture requires. RPC 
mechanisms provide for location of a partner or 
server application, and they provide data type con- 
version and reliable network services. The RPC 
model of application integration, however, is actu- 
ally more appropriate for the distribution of a single 
application across multiple systems in a network. 
This use implies a simple, static relationship 
between the parts of an application: one part is 
always a client that requests the execution of a pro- 
cedure, and the other part is always an RPC server 
that executes the procedure and returns the results. 
In such a relationship, each request generates a sin- 
gle response. This model would be poorly suited 
for supporting the DEC @aGlance monitoring ser- 
vice. When DEC @aGlance was being developed, no 
commercially available RPC implementation ran on 
the key platforms, the OpenVMS and Microsoft 
Windows environments. Furthermore, no one had 
announced their intention to produce a portable 
implementation that would be available on the 
wide range of platforms that we considered impor- 
tant for future versions of DEC @aGlance software. 

Digital's ACA Services was chosen as the basis 
for implementing DEC @aGlance software because 
it implements an application integration model 
that closely matches the requirements of the 
DEC @aGlance environment. ACA Services supplies 
many capabilities required of the integration mech- 
anism including 

■ Abstraction of functions from implementations 

■ The ability to encapsulate existing applications 

■ Location of partner applications on a variety of 
networks 

■ Establishment and management of reliable, long- 
lived communication links 

■ The ability to easily add new applications to the 
system 

■ The ability to easily install new versions of exist- 
ing applications in the system 

■ The correct handling of data type conversions 
between heterogeneous systems 

■ Commercial availability of portable interfaces 
on OpenVMS, Microsoft Windows, Macintosh, 
and a wide variety of UNIX platforms from multi- 
ple vendors 

The class hierarchy capabilities of ACA Services 
allow the creation of new combinations of appli- 
cations integrated to provide new capabilities 
without additional coding. Thus, a new class of 
server can be defined to offer the capabilities of 
a DEC @aGlance data server as well as additional 
capabilities. The older DEC @aGlance servers would 
actually provide the DEC @aGlance services while, 
transparent to the client applications, the new 
server would make the new capabilities available. 

ACA Services has been selected as a major com- 
ponent of the Object Management Group's (OMG) 
Object Request Broker, which in turn has been 
selected as a part of the Open Software Founda- 
tion's (OSF) Distributed Computing Environment 
(DCE). ACA Services is designed to be independent 
of the type of network that provides the interappli- 
cation communications services and currently 
works over both DECnet and TCP/IP networks, the 
networks most commonly found in manufacturing 
environments. Therefore, applications using ACA 
Services need not be concerned about network 
communications. 

ACA Services is supported on the OpenVMS, 
Microsoft Windows, Macintosh, and SunOS operat- 
ing systems, the most often used platforms in this 
application space. In fact, ACA Services is the only 
application integration mechanism currently avail- 
able on all these platforms. Moreover, ACA Services 
supports the kind of asynchronous services 
required by DEC @aGlance. 

Although it provides many important compo- 
nents of the required integration service, ACA 
Services does not completely solve the integration 
problem. ACA Services is a tool intended to be used 
to integrate applications; it does not define the data 
model nor does it define the set of services that 
applications are to provide. Application integrators 
are expected to define (1) the classes of applica- 
tions that provide sets of services, (2) the services, 
and (3) the meaning and type of data to be 
exchanged by applications using the services. 

DEC @aGlance Software: 
The Tool Kit and Add-ins 

As shown in the DEC @aGlance component diagram 
in Figure 5, DEC @aGlance software uses ACA 
Services as a basic application integration facility. 
Above ACA Services, DEC @aGlance adds definitions 
of a class of manufacturing data server applications 
(servers), a set of definitions of the services pro- 
vided by the servers, and definitions of the data ref- 
erence model. 

Figure 5 DEC @aGlance Components 

ACA Services provides a general capability to 
integrate sets of applications. DEC @aGlance soft- 
ware provides a set of routines that are specifically 
designed to simplify the implementation of the set 
of services that DEC @aGlance supports. For server 
applications, DEC @aGlance software supplies a set 
of callback points, as well as callable routines for 
declaring callbacks, filtering strings, and support- 
ing monitoring activities. For client applications, 
DEC @aGlance software supplies a set of callable 
routines for requesting each of the defined ser- 
vices, as well as callback points in support of moni- 
tor updates. 

The DEC @aGlance server library also supports 
a test connectivity capability used to verify that an 
interapplication relationship can be established to 
the server application. This capability simplifies 
the diagnosis of problems encountered during both 
server development and client-server installation. 

To reduce dependence upon properly written
server code, the test connectivity capability
operates entirely within the library. Thus, once a
server has called the DEC @aGlance initialization
routine, and as long as the server is still running,
this service should function properly in response
to requests from DEC @aGlance clients. A successful
test verifies the installation and configuration of
the network and of the ACA Services and DEC @aGlance
run-time components of the systems on which the
client and server applications reside.

Software add-ins, i.e., extensions, for two
popular spreadsheet applications, Lotus 1-2-3 for
Windows and Microsoft Excel for Windows, are
also DEC @aGlance products. These add-ins allow
users of the spreadsheets to request data from
manufacturing data servers by means of the
spreadsheets' macro facilities. The add-ins provide
a dialog box to guide untrained users through the
process of constructing a DEC @aGlance macro.
Once built, a macro can be executed one or more
times, modified if necessary, and saved in a
worksheet for reuse at some other time.

Tool Kit 

The tool kit was developed to encourage the rapid
and successful development of DEC @aGlance
applications by third parties. Successful applications
are those that interoperate with other DEC @aGlance
applications upon delivery to a customer site with
no additional coding, no application recompilation,
and no application rebuilding.

The key components of the tool kit are

■ A DEC @aGlance client or server library

■ Example code

■ ACA Services definition files for the DEC @aGlance
class and methods

■ Simple test facilities

■ The DEC @aGlance Programmer's Guide 10

The ACA Services definition files contain the
information required to define the manufacturing
data server class and the services that members of
the class support. Supplying the definitions in this
form ensures strict consistency among all server
and client developers with regard to these
definitions. The routines in the DEC @aGlance client
and server libraries use these definitions. The
DEC @aGlance libraries contain all the code required
to establish and maintain an ACA Services session.

Server Applications 

A server application built with the tool kit has three
major components: an initialization section, the
control system-specific section, and the DEC @aGlance
section. The initialization section simply declares
the server's name to the DEC @aGlance application,
declares a set of callback points, and enters a
dispatch loop. The server name is the name that client
applications can use to interact with this server.
The callback points are the code entry points to
which DEC @aGlance dispatches in response to the
receipt of service requests from the client
applications. For a server, callback points exist for
the following services:

■ Get a list of tag names

■ Get a list of attribute names 

■ Get a list of data point values 

■ Get a table of data point values 

■ Put a list of data point values

■ Put a table of data point values 

■ Get a table of historical values 

■ Put a list of historical values 

■ Register a monitor request 



■ Cancel a monitor request 

■ Initiate a session 

■ Terminate a session 

■ Execute a server-specific request 

■ Terminate the server 
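The initialization sequence described above (declare a name, declare callback points, enter a dispatch loop) can be illustrated with a minimal sketch. The function `run_server`, the service names, and the in-memory request list are all hypothetical stand-ins chosen for illustration; they are not the DEC @aGlance routines, and a real server would receive its requests through ACA Services rather than from a list.

```python
from collections import deque

# Hypothetical process data held by the server.
TAGS = {"FIC101": 42.5, "TIC205": 180.0}


def run_server(name, callbacks, incoming):
    """Dispatch each incoming request to its declared callback.

    `callbacks` maps a service name to a function; `incoming` is a
    sequence of (service, args) pairs standing in for client requests.
    Returns a log of (service, result) pairs.
    """
    log = []
    queue = deque(incoming)
    while queue:                       # the dispatch loop
        service, args = queue.popleft()
        if service == "terminate_server":
            log.append((service, "stopping " + name))
            break                      # leave the loop; later requests are ignored
        handler = callbacks.get(service)
        result = handler(*args) if handler else "unsupported"
        log.append((service, result))
    return log


# Callback points declared during initialization (two of the services only).
callbacks = {
    "get_tag_names": lambda: sorted(TAGS),
    "get_values": lambda names: [TAGS[n] for n in names],
}

requests = [
    ("get_tag_names", ()),
    ("get_values", (["TIC205"],)),
    ("terminate_server", ()),
    ("get_tag_names", ()),             # arrives after termination
]
log = run_server("PLANT_A", callbacks, requests)
```

In this sketch the server code supplies only the callbacks; the loop that selects and invokes them belongs to the library side, which mirrors the division of labor the tool kit aims for.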

The control system-specific section consists of
code modules that execute calls to the control
system application programming interface (API). These
modules must convert parameters between the
DEC @aGlance format and the control system-specific
format. The entry point of each module is
declared as a callback point during initialization.
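As a rough illustration of such a module, the sketch below converts a request expressed as tag names into calls on an invented control system API and converts the results back. Everything here is assumed for the example: the point table, the `control_system_read` function, the raw-count encoding, and the percent-of-scale output format are hypothetical and do not describe any particular vendor's API or the DEC @aGlance data formats.

```python
# Hypothetical control system: points are addressed by (unit, index)
# and values are returned as raw 12-bit converter counts.
POINT_TABLE = {"FIC101": (3, 17), "TIC205": (3, 42)}
RAW_COUNTS = {(3, 17): 2048, (3, 42): 1024}


def control_system_read(unit, index):
    """Stand-in for a vendor API call that reads one point."""
    return RAW_COUNTS[(unit, index)]


def counts_to_percent(counts, full_scale=4096):
    """Convert raw counts to percent of full scale."""
    return 100.0 * counts / full_scale


def get_values_callback(tag_names):
    """Callback module: tag names in, percent-of-scale values out."""
    values = []
    for tag in tag_names:
        unit, index = POINT_TABLE[tag]              # into the control system's format
        counts = control_system_read(unit, index)   # the vendor API call
        values.append(counts_to_percent(counts))    # back to the common format
    return values


values = get_values_callback(["FIC101", "TIC205"])
```

The conversion in both directions is confined to the callback module, so neither the client nor the dispatching library needs to know how the control system addresses its points.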

In addition, callable routines are provided for
sending monitor updates and for session management.
The DEC @aGlance section of the server is
contained entirely within a library of callable
server routines. This section handles all
interactions with ACA Services, including server
registration and session management. It also handles
the dispatch of incoming requests to the callback
routines and a number of housekeeping tasks for
which each server developer would otherwise
have to develop and implement solutions. The
DEC @aGlance section also responds to test
connectivity requests.

Almost all vendors of manufacturing systems 
have applications that execute calls to the control 
system API, but such applications are typically 
driven off a command language or menu interface. 
Conversion of these applications to a DEC @aGlance
server is relatively easy; some vendors have created
a simple DEC @aGlance server in as little time as
one day.

Client Applications

The typical DEC @aGlance client application is built
on an existing desktop tool. Desktop tools provide
a user interface for performing some class of
generic function such as decision support,
statistical analysis, quality control, or production
scheduling. Other types of applications that could
make use of process data, such as report generators,
batch schedulers, and maintenance tracking
systems, can also provide the basis of DEC @aGlance
client applications. Adding DEC @aGlance support
to an existing tool allows the user to treat data from
DEC @aGlance manufacturing data servers like data
entered manually or from other data sources.

A DEC @aGlance client application incorporates
the DEC @aGlance client routine library, which
provides callable routines for initialization and for
each of the following DEC @aGlance services:

■ Get a list of tag names 

■ Get a list of attribute names 

■ Get a list of data point values 

■ Get a table of data point values 

■ Put a list of data point values 

■ Put a table of data point values 

■ Get a table of historical values 

■ Put a list of historical values 

■ Initiate a monitor request 

■ Cancel a monitor request 

■ Initiate a session 

■ Terminate a session 

■ Execute a server-specific request 

■ Terminate the server 

■ Terminate the client 

In addition, support routines are provided for
handling monitor updates.

To support the DEC @aGlance monitoring capability,
a client application must have some server
characteristics. Once a monitoring request has 
been initiated, the server issues monitor update 
requests when the monitoring condition is satis- 
fied. The monitor update requests are received by 
the client application using the same callback 
mechanism that the server uses when servicing 
client requests. 
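The monitoring exchange can be sketched as follows. The class `MonitorServer`, its method names, and the threshold condition are hypothetical and serve only to show the direction of the calls: the client hands the server a callback when it initiates the monitor, and the server later invokes that callback when the condition is satisfied.

```python
class MonitorServer:
    """Hypothetical server side of a monitor request."""

    def __init__(self):
        self.monitors = {}   # monitor id -> (tag, condition, client callback)
        self.next_id = 1

    def register_monitor(self, tag, condition, on_update):
        monitor_id = self.next_id
        self.next_id += 1
        self.monitors[monitor_id] = (tag, condition, on_update)
        return monitor_id

    def cancel_monitor(self, monitor_id):
        self.monitors.pop(monitor_id, None)

    def new_sample(self, tag, value):
        """Called when the control system produces a new value."""
        for monitor_id, (mtag, condition, on_update) in list(self.monitors.items()):
            if mtag == tag and condition(value):
                # The server now acts as the requester and the client
                # acts as the party servicing the request.
                on_update(monitor_id, tag, value)


received = []


def client_update_callback(monitor_id, tag, value):
    """Client-side callback point for monitor updates."""
    received.append((monitor_id, tag, value))


server = MonitorServer()
monitor = server.register_monitor("TIC205", lambda v: v > 200.0,
                                  client_update_callback)
server.new_sample("TIC205", 180.0)   # condition not satisfied, no update
server.new_sample("TIC205", 215.5)   # condition satisfied, client is called
server.cancel_monitor(monitor)
server.new_sample("TIC205", 230.0)   # monitor cancelled, no update
```

Only the second sample reaches the client, which is why the client must be able to accept incoming calls in the same way a server does.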

A typical client calls the DEC @aGlance
initialization routine and then continues to perform
its normal functions. When a DEC @aGlance service is
requested through the user interface or other
mechanism, the application simply formats the
request and calls the appropriate DEC @aGlance
service request routine. Upon completion of the
routine, status (and if requested, data) is returned
from the server application. If data is returned that
is to be further processed by the client application,
the application moves the data to its workspace in
preparation for additional processing.
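A minimal sketch of this request flow follows. The routine `request_values`, the status strings, and the `Client` class are hypothetical placeholders rather than the DEC @aGlance client library; the stand-in routine answers from a local dictionary instead of contacting a server.

```python
# Hypothetical data a server might hold.
SERVER_DATA = {"FIC101": 42.5, "TIC205": 180.0}


def request_values(server_name, tags):
    """Stand-in for a service request routine: returns (status, data)."""
    if any(tag not in SERVER_DATA for tag in tags):
        return ("unknown_tag", None)
    return ("success", [SERVER_DATA[tag] for tag in tags])


class Client:
    """A desktop tool that keeps retrieved values in its own workspace."""

    def __init__(self):
        self.workspace = {}

    def fetch(self, server_name, tags):
        tags = list(tags)                         # format the request
        status, values = request_values(server_name, tags)
        if status == "success":
            # Move the returned data into the workspace for later processing.
            self.workspace.update(zip(tags, values))
        return status


client = Client()
first = client.fetch("PLANT_A", ["FIC101"])
second = client.fetch("PLANT_A", ["NO_SUCH_TAG"])
```

The client inspects the returned status before touching its workspace, so a failed request leaves previously retrieved data intact.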

DEC @aGlance Lotus 1-2-3 for
Windows and Microsoft Excel Add-ins

Whereas most manufacturing control systems pro- 
vide a callable library that allows the development 
of applications that access the data in the system, 
some desktop tool applications have mechanisms 
that allow for extension of their capabilities in the 
field. Spreadsheet applications such as Lotus 1-2-3 
and Microsoft Excel support the use of add-in mod- 
ules to add external functions and external macro 
capabilities. Add-ins for these two spreadsheets are 
available as DEC @aGlance software products. 

With the add-ins, spreadsheet users can access 
most DEC @aGlance services and thus can 

■ Fill a range of cells with a list of tag names from 
a server 

■ Fill a range of cells with a list of attribute names 
associated with a range of tag names in a server 

■ Fill a range of cells with a list of data point values 

■ Fill a range of cells with a table of data point 
values, as shown in Figure 6 

■ Write a list of data point values to a server 

■ Write a table of data point values to a server 

■ Fill a range of cells with a table of historical 
values for a specific time interval 

■ Write a list of historical values

Figure 6 A Table of Data Point Values in a Spreadsheet

The interface for the add-ins was designed to
support ad hoc inquiries. A dialog box guides the end
user through the process of supplying the
appropriate parameters for a selected function. Where
appropriate, defaults are suggested based upon the
previous inquiry.
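To make the table-fill operation listed above concrete, the following sketch writes one row per tag and one column per attribute into a range of cells. The dictionary-based sheet, the function `fill_table`, and the sample tags and attributes are assumptions made for the example; the actual add-ins operate through the spreadsheets' own macro facilities.

```python
def fill_table(sheet, top, left, tags, attributes, lookup):
    """Fill a cell range with one row per tag and one column per attribute.

    `sheet` maps (row, column) to a cell value; `lookup(tag, attribute)`
    stands in for a request to a manufacturing data server.
    """
    for row_offset, tag in enumerate(tags):
        for col_offset, attribute in enumerate(attributes):
            sheet[(top + row_offset, left + col_offset)] = lookup(tag, attribute)
    return sheet


# Hypothetical server contents.
SERVER_DATA = {
    ("FIC101", "VALUE"): 42.5, ("FIC101", "UNITS"): "GPM",
    ("TIC205", "VALUE"): 180.0, ("TIC205", "UNITS"): "DEGC",
}

sheet = fill_table(
    {}, top=2, left=1,
    tags=["FIC101", "TIC205"],
    attributes=["VALUE", "UNITS"],
    lookup=lambda tag, attribute: SERVER_DATA[(tag, attribute)],
)
```

Here the range starts at row 2, column 1, so the value of FIC101 lands in cell (2, 1) and the units of TIC205 in cell (3, 2).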

Summary 

DEC @aGlance software has been specifically 
designed to make it easy for users of desktop tools 
to access, explore, and analyze data from dis- 
tributed control systems, supervisory control sys- 
tems, and other common systems used to run 
manufacturing processes. An analysis of the infor- 
mation environment and the ways in which end 
users want to access the data led to the refinement 
of the architectural requirements. The analysis also 
led to the decision to use ACA Services as the appro- 
priate mechanism for integrating desktop and man- 
ufacturing control applications. The creation of a 
usage model and rapid deployment of prototypes 
were instrumental in the analysis. To promote 
widespread availability of plug-compatible appli- 
cations that use DEC @aGlance, a developer's tool 
kit was created. The tool kit contains libraries of 
DEC @aGlance routines that both simplify and 
encourage proper and consistent usage of ACA 
Services to integrate DEC @aGlance applications. 

DEC @aGlance add-ins for the popular spreadsheet
programs Lotus 1-2-3 for Windows and
Microsoft Excel for Windows were also developed.
With the add-ins, users can interactively explore
data in plant manufacturing control systems from
within a familiar spreadsheet, as well as write
reusable worksheet macros for performing
repeated tasks like report generation.